1.  INTRODUCTION


This  thesis  discusses  primarily  the  theoretical  basis  of  a 
verifier  for  sorting  programs  designed  for  use  in  an  automatic  tutor 
for computer programming. A system, called SORTLAB, has been built
embedding the sorting program verifier. SORTLAB allows a student to write
programs  for  sorting  an  array,  and  decides  whether  these  programs  are 
correct;  if  they  are  not,  it  generates  counterexamples.   SORTLAB  has 
been  implemented  on  the  PLATO  system  for  computer-aided  instruction 
[Alpert  and  Bitzer  1970],  as  a  part  of  an  Automated  Computer  Science 
Education  System,  ACSES  [Nievergelt  1975]. 

The development of SORTLAB required several different components,
in particular:

-  a  programming  language  convenient  enough  to  write  programs 
for  sorting  an  array  and  not  necessarily  other  programs, 

-  an assertion language with just the required expressive power
to assert the state of an array with respect to the order
of its elements,

-  special  purpose  techniques  for  verifying  these  programs 
with  assertions,  including  a  theorem  prover  for  the  class  of 
lemmas  generated  by  the  verifier. 

1.1  Automatic Program Verification


With the increased concern for program reliability, the verification
of programs is receiving greater attention than ever before. The
verification process consists of checking whether the program meets its
specifications, namely, that it always terminates and that, when it does,
certain variables have a desired property, provided the input given to
the program meets the input specification.

The inductive assertion method of proving programs considers
these two problems separately: that the program meets its input/output
specifications is proven separately from its termination.
The method also requires that an invariant property of the program
variables be given for every loop. Given these specifications and loop
assertions, a set of mathematical lemmas is generated, which depends on
the assertions given and on the semantics of the programming language
used. If these lemmas are true, the correctness of the program is
guaranteed; thus proving that a program meets its specification is
equivalent to proving a certain set of lemmas. This is the crux of the
problem.

Much of the verification process is of a very mechanical
nature, and unless a large part of the process is carried out by the
computer, few programmers would be willing to hand-verify their programs.
A number of program verifiers have been constructed (see the survey by
London [1972]), requiring varying degrees of human intervention. However,
these are far from being helpful to a programmer, for several reasons.
Typically, human aid is required in pruning the proof trees. An ordinary
programmer is not trained in theorem proving, and is usually not interested
in how these lemmas are proved. In addition, if the program is incorrect,
the verifiers cannot provide assistance either by generating a counterexample
or by pointing out where the error lies. Finally, the verifiers
are slow in operation, even for small programs. These conditions
combine to tempt a programmer to test-run his programs rather than submit
them to an automatic verifier!

1.2  Limited Domain Program Verifiers


The failure to construct verifiers that are mechanical aids
to program writing can be attributed largely to the ambitious approach
taken in building these verifiers. Except for the earliest of the
verifiers [King 1969], they have been increasingly ambitious in the
variety of programs they intended to verify. The wide scope of programs
being proven requires that the programs be written using elementary but
powerful operations. Further, the assertions, and hence the lemmas
generated, have to be formulated in first-order predicate calculus (or
the equivalent), which is theoretically undecidable. By increasing the
power of the theorem provers, we not only make them nondecision procedures,
but they also lose a sense of direction toward their goal. A large number
of useless inferences are then generated. Even among the decidable
domains of problems, the theorem provers must be carefully designed in
order to yield a decision procedure that works in practice. A "good"
theorem prover should very quickly prove a large class of theorems that
are often encountered, while it may take a while to decide about
others.

A verifier and its theorem prover can become simple if they
incorporate certain aspects of the semantics of the problem domain. For
example, most well-written nonnumeric programs manipulate their data
structures in a "disciplined and uniform" way, which is as yet not
formally characterizable. The verification lemmas arising from such
programs seem to be of a different nature from those that may arise in
ordinary mathematics, say, number theory. If this is indeed the case,
the underlying formal system may be decidable. If so, an incorrect program
may be proven to be incorrect, counterexample generation may be
feasible, and fast theorem-proving procedures may exist for the specific
class of lemmas.

Strictly speaking, every program verifier constructed so far
is  a  limited  domain  verifier.  For  example,  the  programs  being  verified 
are  often  limited  to  those  that  operate  on  integer-valued  variables.  But 
we  mean  to  limit  the  domain  even  further.  Some  examples  of  such  domains 
are  programs  operating  on  linear  arrays  with  no  arithmetic,  those  using 
lists,  binary  trees,  etc. 

It is doubtful whether it will ever be possible to construct successful
general purpose program verifiers. On the other hand, practical
verifiers  dealing  with  programs  from  a  limited  domain  of  discourse  can  be 
designed.  This  thesis  provides  one  such  example,  namely,  a  verifier  for 
in-place  sorting  programs  which  is  being  used  in  an  automatic  tutor  of 
computer  programming. 

1.3  Program Verification in Teaching Programming

It  is  important  that  a  student  programmer  realize  the  need  for 
program  reliability.  A  concern  for  the  correctness  of  programs  at  an 
early stage in one's education has great impact on one's attitudes toward
programming in later years. As exemplified by Dijkstra and others,
a systematic method of designing abstract programs depends heavily on
the correctness proofs of programs. The "elegance" of a program is
usually  directly  proportional  to  the  ease  with  which  it  can  be  proven 
correct.  There  can  be  no  question  that  one's  understanding  of  one's  own 
program  is  increased  greatly  after  inventing  the  loop  assertions  for  the 
program.  Quite  often  one  discovers  better  ways  of  writing  the  program. 

In teaching programming, one would like to supervise the student's
program design process, as well as examine thoroughly the
finished product. Both these aspects are amenable to computerization,
particularly if an interactive computer system is available. The teacher
program supervising the program design process should be an expert in
the programming problem domain, must have an "opinion" about various design
methodologies, and, perhaps more importantly, be able to converse
with the student in a reasonable language. If the problem domain is
sufficiently simple, such teaching programs can indeed be designed. For
an example of such a system, see [Danielson 1975].

On  the  other  hand,  a  teacher  program  examining  the  student's 
finished  program  will  not,  and  should  not,  consider  the  design  process. 
Regardless  of  how  it  was  constructed,  judging  the  program's  correctness 
and  elegance  should  be  its  concern.  The  teacher  program  may,  at  one 
extreme,  simply  test  run  the  student  program,  or,  at  the  other  extreme, 
attempt to formally verify the student program. Such a teacher program
should contain at least a program editor, a run-time system, a program
verifier, and a counterexample generator.

Apart from these technical qualifications required of the
teacher programs, they should be fast enough to give interactive response
to the student. These considerations led us to write a specialized teacher
program with built-in knowledge of a programming domain, resulting in an
interactive programming laboratory, SORTLAB, wherein the student can
prepare a sorting program and use the program verifier iteratively until
a correct program is obtained.

1.4  The Sorting Program Verifier

We have chosen in-place sorting as the limited domain of discourse
in SORTLAB for two main reasons. First, every program
verifier  constructed  so  far  has  verified  several  sorting  programs;  their 
authors  quote,  quite  often  exclusively,  these  examples.  This  gives  us 
a  basis  for  comparison.  Secondly,  sorting  programs  are  perhaps  the  most 
used  examples  in  introductory  programming  courses. 

The verifier can actually prove any program, sorting or not,
written in our mini-programming language and whose behavior can be
asserted in the assertion language. (See Sections 2.2 and 3.1 for a description
of these languages.) If the program is not proven correct,
then at least one of the mathematical lemmas generated from the program
and its assertions must be false. The verifier can generate counterexamples
to these lemmas.

2.  VERIFIER


Every  program  operates  on  a  certain  set  of  data  objects  and 
aims  to  produce  an  output  set  of  data  objects  with  desired  properties. 
A  subset  of  these  data  objects,  the  input,  is  given  to  the  program,  and 
the remaining data objects are the result of program execution. Quite
often, the input changes in its structure: data objects get created or
destroyed, and their structure and relationships change. The program is
expected  to  realize  a  desired  property  on  the  output  only  if  the  input 
meets  certain  requirements.  To  this  end,  the  programmer  asserts  what 
relationships  are  to  hold  on  the  input  data  objects,  and  what  holds  on 
the  output. 


2.1  Inductive  Assertion  Method  of  Verification 

Given the input and output assertions, say $\phi$ and $\psi$, we are
interested in verifying that the program P behaves properly, i.e., whenever P
is given input satisfying $\phi$, the output satisfies $\psi$, if and when P terminates.
Notationally, following [Manna and Pnueli 1974], let us express this statement by:

    $\{\phi \mid P \mid \psi\}$                                        (2.1)


The program P is said to be partially correct with respect to $\phi$ and $\psi$ if
(2.1) is true. (Occasionally we refer to $\phi$ and $\psi$ as the entry and exit
assertions of P when P is a program segment.) P is said to be totally
correct if, in addition to (2.1) being true, P always terminates. In
this thesis we will be dealing with partial correctness only, and
henceforth refer to this simply as correctness. We shall use "prove
a program segment" as an abbreviation of "prove a program segment correct
with respect to its entry and exit assertions."

The  proof  of  (2.1)  is  trivial,  in  theory,  if  the  set  of  data 
objects satisfying the entry assertion $\phi$ is finite, since the program
can  be  checked  separately  on  each  of  these  data  objects.  However,  in 
general  this  set  is  infinite,  and  even  if  it  is  finite,  it  is  usually 
such  a  large  set  that  separate  treatment  of  each  data  object  is  not 
practical. 
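
The exhaustive check just described can be sketched for very small arrays. The Python fragment below is an illustration only and is not part of SORTLAB; the function names, the key set (0, 1, 2), and the length bound are assumptions chosen for the example. It runs a candidate sorting routine on every array up to a given length and returns the first input whose output is not sorted, which is a counterexample in the sense used above. Only sortedness is tested; conservation of the keys is taken for granted here.

```python
from itertools import product

def bubble_sort(keys):
    """In-place style bubble sort using only exchanges of adjacent keys."""
    x = list(keys)
    for i in range(len(x), 1, -1):        # i runs from n down to 2
        for j in range(1, i):             # j runs from 1 up to i-1 (1-based)
            if x[j - 1] > x[j]:
                x[j - 1], x[j] = x[j], x[j - 1]
    return x

def first_counterexample(program, max_n=5, keys=(0, 1, 2)):
    """Return the first input whose output is unsorted, or None if none is found."""
    for n in range(max_n + 1):
        for x in product(keys, repeat=n):
            out = program(x)
            if any(out[k] > out[k + 1] for k in range(len(out) - 1)):
                return x
    return None
```

The number of inputs grows as (number of keys)^n, which is the practical objection raised above; a bounded search of this kind can refute a program but cannot establish (2.1) for all inputs.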

One of the most widely used verification techniques, the inductive
assertion method [Floyd 1967, Naur 1966], divides the verification
process into two phases. First, a set of mathematical lemmas ("verification
conditions") is generated, which, if proven, is sufficient to imply
the correctness of the program. The second phase is the proof of the
lemmas thus generated.

The  generation  of  the  verification  conditions  is  intimately 
linked  to  the  semantics  of  the  programming  language  being  used,  and  to 
the structure of the program at hand. The program being proven is decomposed
into loop-free segments such that each segment has an entry assertion
and an exit assertion. Any computation performed by the program is
then a concatenation of executions of some selected segments. Thus, if
each segment is correct with respect to its entry and exit assertions, using
induction on the number of loop iterations, it can be proved that every
computation with input satisfying the input assertion of the program
yields output satisfying the output assertion when the program terminates.


Once  the  validity  of  this  inductive  proof  is  established,  to  prove  a 
given  program,  only  the  decomposed  loop-free  segments  need  be  proven. 
Requiring  each  loop  to  include  an  invariant  assertion  guarantees  the 
decomposition  of  the  program  into  loop-free  segments  each  with  entry 
and  exit  assertions. 

Section 2.3 describes the generation of these lemmas. The forward
substitution method is touched upon only briefly as we shall be
using the backward substitution method. The programming and assertion
languages used are described informally in Section 2.2; their formal
specification appears in Chapter 5. Figures 2.1 and 5.2 give examples of
programs written in our languages. A decision procedure for the lemmas
is presented in Chapter 3.


2.2  The  Programming  and  Assertion  Languages 


In  the  interest  of  developing  a  fast  and  small  verifier,  we 
limit  the  domain  of  programs.  But  the  variety  of  programs  cannot  be 
limited  by  a  programming  language  alone.  As  is  well  known  [McCarthy 
1960],  a  programming  language  rich  enough  to  include  a  successor  function, 
a  conditional,  and  recursion  is  universal  in  the  sense  that  any  recursive 
function  can  be  programmed  in  this  language.  A  programming  language,  by 
imposing  constraints  and  providing  certain  kinds  of  primitive  operations 
while eliminating others, can only make it very inconvenient, but not
impossible,  to  write  certain  programs. 

On the other hand, an appropriately chosen assertion language
can limit the kind of programs that can be asserted in that language. A
more general assertion language will use elementary, but powerful,
atomic predicates. This use of elementary predicates makes it difficult
to lump together all related predicates. The loss of power of expression
in a limited assertion language is compensated for by the large
and recognizable chunks of properties in the assertion. Further, while
it has been advocated that theorem provers make large inferences, it
appears necessary that related information should be recognizable as
such before large inferences can be made.

 1  procedure sort (n)
    * true
 2    scan down with i from n to 2
 3      scan up with j from 1 to i-1
 4        if xj > xj+1 then
 5          exchange xj with xj+1
 6        else
 7        endif
    * 1 ≤ J < I ≤ N & A(1;J) ≤ XJ+1 & A(1;I) ≤ S(I+1;N)
 8      endscan
    * 1 < I ≤ N & A(1;I-1) ≤ S(I;N)
 9    endscan
    * S(1;N)
10  endproc

Abbreviations
    A(                          for  array(
    S(                          for  sorted(
    array(s,t) ≤ sorted(u,v)    for  array(s,t) ≤ array(u,v) and sorted(u,v)

Figure 2.1  A Bubble Sort Program with Assertions
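
To make the assertions of Figure 2.1 concrete, the bubble sort can be transcribed into Python with each assertion turned into a run-time check. This is a sketch under assumptions: the helper names are hypothetical, the comparisons in the assertions are read as "less than or equal", and run-time checking on one input is of course not verification.

```python
def array_le(x, s, t, u, v):
    """array(s,t) <= array(u,v): every key of x[s..t] is <= every key of x[u..v].
    Indices are 1-based and inclusive; an empty segment makes the predicate true."""
    return all(x[p - 1] <= x[q - 1]
               for p in range(s, t + 1) for q in range(u, v + 1))

def is_sorted(x, s, t):
    """sorted(s,t) for the 1-based inclusive segment x[s..t]."""
    return all(x[p - 1] <= x[p] for p in range(s, t))

def bubble_sort_checked(keys):
    x = list(keys)
    n = len(x)
    for i in range(n, 1, -1):                  # scan down with i from n to 2
        for j in range(1, i):                  # scan up with j from 1 to i-1
            if x[j - 1] > x[j]:
                x[j - 1], x[j] = x[j], x[j - 1]    # exchange xj with xj+1
            # assertion at the end of the inner loop body
            assert 1 <= j < i <= n
            assert array_le(x, 1, j, j + 1, j + 1)
            assert array_le(x, 1, i, i + 1, n) and is_sorted(x, i + 1, n)
        # assertion at the end of the outer loop body
        assert 1 < i <= n
        assert array_le(x, 1, i - 1, i, n) and is_sorted(x, i, n)
    assert is_sorted(x, 1, n)                  # exit assertion S(1;N)
    return x
```

Each assertion groups several facts about whole segments, which is the "large and recognizable chunks" property argued for above.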

These considerations led us to design a mini-programming language
and an assertion language which are specific to the sorting of
arrays. Formal specification of the languages is given in Chapter 5.
Below we touch upon only the salient features.

2.2.1  Programming  Language 

Operations  on  Keys 

It  is  a  well-recognized  principle  in  program  design  that  basic 
procedures,  specific  to  the  particular  problem  and  the  data  structures 
being  used,  should  be  developed  and  used  so  that  data  integrity  may  be 
preserved [Dahl et al. 1972]. Sorting programs must conserve the keys
they  are  sorting.  Hence,  we  provide  two  basic  operations:  exchange  and 
insertion  of  keys,  and  forbid  value  assignments  to  the  keys  of  the  array. 
This  guarantees  that  the  elements  of  the  array  are  conserved  throughout 
the  program.  Therefore,  our  verifier  need  only  prove  that  the  array  is 
sorted. 
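
The conservation argument can be illustrated with a small Python sketch. The names are hypothetical, and the reading of insert used here (move the key at index a to sit immediately before the key at index b) is an assumption made for the example; the precise semantics are those of the formal specification in Chapter 5.

```python
from collections import Counter

def exchange(x, a, b):
    """exchange x_a with x_b (1-based indices)."""
    x[a - 1], x[b - 1] = x[b - 1], x[a - 1]

def insert_below(x, a, b):
    """Assumed reading of 'insert x_a below x_b': move key x_a just before x_b."""
    if a == b:
        return
    key = x.pop(a - 1)
    # After the pop, x_b has shifted down by one position when a < b.
    target = b - 1 if a > b else b - 2
    x.insert(target, key)

def same_keys(before, after):
    """True when both arrays hold the same multiset of keys."""
    return Counter(before) == Counter(after)
```

Since neither operation writes a new value into the array, any sequence of them leaves the multiset of keys unchanged, whatever the indices.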

Operations on Array Indices

Successor  and  predecessor  functions  on  the  indices  ("ptrs") 
of  the  array  provide  sequential  access  to  the  elements.  A  ptr  variable 
may be assigned the value of a ptr expression, which is of the form
<ptr variable> + <integer constant>.

Procedures

In  our  verification  system,  we  assume  that  the  "intention"  of 
any  procedure  is  to  produce  a  certain  permutation  of  the  elements  of  the 
array  x,  which  is  global  to  all  procedures.  Most  procedures,  however, 
permute  only  the  elements  belonging  to  a  certain  contiguous  segment,  say 
array  (a,b);  thus,  we  require  that  each  procedure  have  exactly  two  input 
(formal)  ptr  variables  a,  b.  All  elements  not  belonging  to  array  (a,b) 
are  made  "read  only"  to  this  procedure;  no  such  element  is  permitted  to 
participate  in  any  exchange  or  insert  operation.  "Guard  expressions" 
are  provided  for  this  (see  also  [Marmier  1975],  p.  57). 

For  simplicity,  we  insist  that  the  formal  ptr  variables  a,  b  be 
not  subject  to  assignment  (i.e.,  they  may  not  appear  on  the  left  hand 
side of any assignment). The two variables must, of course, be distinct.
Optionally,  a  procedure  may  return  ptr  results  to  the  calling  procedure. 
There  are  no  global  ptr  variables.  An  entry  and  an  exit  assertion  for 
the  body  of  the  procedure  must  be  given. 


Procedure  Calls 


A procedure call must contain two ptr expressions as the actual
input parameters for the procedure called. If the called procedure has
output  parameters,  the  call  must  receive  these  results  in  distinct 
ptr  variables.  We  further  insist  that  these  variables  be  distinct 
from  those  appearing  in  the  input  ptr  expressions;  this  is  done  for 
the  sake  of  simplicity. 

For  each  call  statement,  we  require  that  an  entry  assertion 
to the call be given. Thus, the user must give not only the loop invariants
and the entry and exit assertions for procedures, but also an entry
assertion  for  each  call  (see  Section  6.1.2  for  a  related  discussion). 

Control  Structures 


In addition to the familiar if and while statements, a scan
statement  (similar  to  the  for  statement  of  other  languages)  is  provided. 
The  loop  variable  of  scan,  however,  is  not  considered  "unmodifiable"  by 
its  body. 

2.2.2  Assertion  Language 



Array  Predicates 

When  sorting  arrays  with  sequential  access,  we  generally  need 
atomic  predicates  to  indicate  that: 


1.  The array segment from index s to index t is sorted
    $\equiv$  if $s \le i \le j \le t$ then $x_i \le x_j$, and

2.  Elements of the segment from s to t are all less than or equal to any
    element of the segment from u to v
    $\equiv$  if $s \le i \le t$ and $u \le j \le v$ then $x_i \le x_j$

where x is the name of the array, and an index is a ptr expression.
These array predicates are abbreviated as:

    sorted (s,t)
and
    array (s,t) $\le$ array (u,v)

respectively. The segments are defined by their lower boundaries s, u
and the upper boundaries t, v.
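
The two array predicates translate directly into executable checks. The sketch below is illustrative: the function names are hypothetical, indices are 1-based and inclusive as in the text, and the comparisons are read as "less than or equal" (an assumption where the printed symbol is unclear).

```python
def sorted_seg(x, s, t):
    """sorted(s,t): for all s <= i <= j <= t, x_i <= x_j."""
    return all(x[i - 1] <= x[j - 1]
               for i in range(s, t + 1) for j in range(i, t + 1))

def array_le(x, s, t, u, v):
    """array(s,t) <= array(u,v): for all s <= i <= t and u <= j <= v, x_i <= x_j."""
    return all(x[i - 1] <= x[j - 1]
               for i in range(s, t + 1) for j in range(u, v + 1))
```

For the array x = [2, 1, 3, 5, 8], sorted_seg(x, 3, 5) and array_le(x, 1, 2, 3, 5) both hold while sorted_seg(x, 1, 5) does not. A segment whose lower boundary exceeds its upper boundary is empty, and both predicates are then vacuously true.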

Ptr  Predicates 


Predicates relating the indices of the array will also be
needed:

    ptr i is at least c units below j  $\equiv$  $i + c \le j$

where i, j are ptr variables and c an integer constant, and i + c is a
ptr expression.

Assertions 


An assertion, then, is a sentence formed of these basic predicates
and the logical connectives and and or. Notice the absence of
negation in this language, which makes it impossible to assert that an
array is NOT sorted. However, ptr predicates, e.g., $i + c \le j$, can be
negated as $j + (1-c) \le i$. Henceforth, the assertions will be written
informally; e.g., $i + c > j$ rather than $j + 1 - c \le i$, or $i = j$ rather
than $i + 0 \le j$ and $j + 0 \le i$.
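
That the negation of a ptr predicate is again a ptr predicate rests on integer arithmetic: not (i + c ≤ j) is j < i + c, i.e. j + 1 ≤ i + c. The short Python check below, with hypothetical function names and an arbitrary test range, confirms the rewriting on a grid of integer values.

```python
def at_least_below(i, c, j):
    """The ptr predicate 'i is at least c units below j', i.e. i + c <= j."""
    return i + c <= j

def negated(i, c, j):
    """The negation written as a ptr predicate: j + (1 - c) <= i."""
    return at_least_below(j, 1 - c, i)

def negation_rule_holds(lo=-6, hi=6):
    """Compare the rewritten negation with logical negation on a grid of integers."""
    values = range(lo, hi + 1)
    return all(negated(i, c, j) == (not at_least_below(i, c, j))
               for i in values for c in values for j in values)
```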

2.3     Verification  Condition  Generator 

We  first  discuss  the  generation  of  verification  conditions  of 
a  simple  loop  program  segment  W  with  a  loop-free  body  S. 


    W:  while B do
            S
        endwhile                                              (2.2)


This  is  then  generalized  to  cover  arbitrary  procedures.     Two  general 
methods for the generation of verification conditions are forward substitution
and backward substitution [King 1969].

2.3.1  Forward Substitution

Let $\phi_W$ be the entry assertion of W, and $\phi_S$ be the entry assertion
of S. We then symbolically execute S on $\phi_S$ to obtain an assertion
Sf($\phi_S$). Then the exit assertion of W is generated as

    $\phi_W$ and not B  or  Sf($\phi_S$) and not B                    (2.3)


The lemmas to be proven are

    $\phi_W$ and B  logically implies  $\phi_S$                       (2.4)
    Sf($\phi_S$) and B  logically implies  $\phi_S$                   (2.5)

Proving (2.4) and (2.5) guarantees that the entry assertion $\phi_S$ of S
will be true each time S is entered.

The assertion Sf($\phi_S$) can be obtained by forward substitution as
follows: If S is empty then Sf($\phi_S$) is the same as $\phi_S$. Otherwise, let
S be a concatenation of S1 and S2, where S2 may be empty. Then we obtain
Sf($\phi_S$) by recursively applying rules F1 and F2 defined as follows:

Rule F1 (applicable iff S1 is an assignment statement)

    Let S1 be u ← t where t is an expression;
    then Sf($\phi_S$) is

        S2f((subst u' for u in $\phi_S$) and u = (subst u' for u in t))     (2.6)

Rule F2 (applicable iff S1 is an if statement)

    Let S1 be

        if B1
        then S3
        else S4
        endif

    Then Sf($\phi_S$) is

        S2f(S3f($\phi_S$ and B1) or S4f($\phi_S$ and not B1))               (2.7)

where subst y for z in F stands for the expression obtained by substituting
y for all occurrences of z in the expression F. The variable u'
refers to the previous value (before S1) of the variable u; thus (2.6)
asserts the existence of a value for u which satisfied $\phi_S$ prior to S1
and which is to be used in the expression t. This introduction of existential
quantifiers causes certain technical difficulties for our theorem
prover (see Chapter 3). Hence we have chosen to abandon forward substitution,
even though it seems appealing due to its close association with
ordinary execution of programs, and to adopt the backward substitution
method, which does not introduce any quantifiers.
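
As a small worked illustration (our own, not one of the numbered formulas), consider the single assignment j ← j + 1 with the ptr predicate $j \le i$ taken once as entry assertion and once as exit assertion. Forward substitution has to name the old value j' and quantify over it, whereas backward substitution is a plain textual replacement:

```latex
\begin{align*}
\text{forward:}\quad
  & \mathit{Sf}(j \le i) \;=\; \exists j'.\ (j' \le i)\ \text{and}\ j = j' + 1 \\
\text{backward:}\quad
  & \mathit{Sb}(j \le i) \;=\; \text{subst } j+1 \text{ for } j \text{ in } (j \le i)
    \;=\; j + 1 \le i
\end{align*}
```

The backward result is again a ptr predicate of the assertion language, while the forward result is not until the quantifier is eliminated.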

2.3.2  Backward  Substitution 

Without loss of generality, let the given loop invariant be the
exit assertion of the loop body S. Given the exit assertion $\psi_S$ of S,
we generate an entry assertion $\phi_S$ such that $\{\phi_S \mid S \mid \psi_S\}$. It should be
noted that several such assertions $\phi_S$ exist, one of them being the trivial
false. However, the $\phi_S$ generated by backward substitution is such that,
for any $\phi$, if $\{\phi \mid S \mid \psi_S\}$ then $\phi$ logically implies $\phi_S$.

Now let $\psi_W$ be the exit assertion of W, and $\psi_S$ be the exit assertion
of the loop body S. We can symbolically "unwind" the execution
in the backward direction and obtain the entry assertion of S as $\phi_S$ =
Sb($\psi_S$). Then the entry assertion of W is

    $\phi_W$ = $\phi_S$ and B  or  $\psi_W$ and not B                 (2.8)

Proving the lemmas

    $\psi_S$ and B  logically implies  Sb($\psi_S$)                   (2.9)

and

    $\psi_S$ and not B  logically implies  $\psi_W$                   (2.10)

guarantees that the entry assertion $\phi_S$ = Sb($\psi_S$) of S will be true each
time S is entered, and that the exit assertion $\psi_W$ of W holds when the
while-loop is exited.
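
The two lemmas can be tried out on a concrete loop by a bounded search. The Python sketch below is an illustration under stated assumptions, not the decision procedure of Chapter 3: the loop is the inner loop of a bubble sort written with an explicit j ← j + 1, the invariant and exit assertion are our own choices for this loop, and Sb of the invariant is evaluated semantically by executing the loop-free body and then testing the invariant. All names are hypothetical.

```python
from itertools import product

def seg_le_key(x, s, t, k):
    """array(s,t) <= x_k, with 1-based inclusive indices."""
    return all(x[m - 1] <= x[k - 1] for m in range(s, t + 1))

def invariant(x, i, j):
    """Assumed loop invariant: 1 <= j <= i <= n and array(1,j-1) <= x_j."""
    return 1 <= j <= i <= len(x) and seg_le_key(x, 1, j - 1, j)

def exit_assertion(x, i, j):
    """Assumed exit assertion of the loop: array(1,i-1) <= x_i."""
    return seg_le_key(x, 1, i - 1, i)

def guard(x, i, j):
    """Loop condition B: j <= i-1."""
    return j <= i - 1

def loop_body(x, i, j):
    """Loop-free body S: compare-exchange x_j with x_j+1, then j <- j+1."""
    x = list(x)
    if x[j - 1] > x[j]:
        x[j - 1], x[j] = x[j], x[j - 1]
    return tuple(x), i, j + 1

def lemmas_hold(body=loop_body, max_n=4, keys=(0, 1, 2)):
    """Search small states for a violation of lemma (2.9) or lemma (2.10)."""
    for n in range(1, max_n + 1):
        for x in product(keys, repeat=n):
            for i, j in product(range(1, n + 1), repeat=2):
                if not invariant(x, i, j):
                    continue
                if guard(x, i, j):
                    # (2.9): invariant and B must imply Sb(invariant)
                    if not invariant(*body(x, i, j)):
                        return False
                elif not exit_assertion(x, i, j):
                    # (2.10): invariant and not B must imply the exit assertion
                    return False
    return True
```

With the body shown, no violation is found; replacing the body by one that omits the exchange makes the search fail, since the state x = (1, 0), i = 2, j = 1 satisfies the invariant and B but not Sb of the invariant.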

The assertion Sb($\psi_S$) is obtained by backward substitution as
follows: Let S be a concatenation of S1 and S2, S = S1; S2, where S2 may
be empty. We consider two cases.

S1 is not a call statement

We recursively apply rules B1 and B2 to obtain Sb($\psi_S$).

Rule B1 (applicable iff S1 is either a ptr-assignment, exchange, or insert
statement)

    B1.1:  Let S1 be u ← t where t is a ptr expression.
           Then $\phi_S$ is

               Sb($\psi_S$) = subst t for u in S2b($\psi_S$)                (2.11)

    B1.2:  Let S1 be exchange $x_a$ with $x_b$ where a, b are ptr expressions.
           Then,

               Sb($\psi_S$) = exchb $x_a$ with $x_b$ in S2b($\psi_S$)       (2.12)

    B1.3:  Let S1 be insert $x_a$ below $x_b$, where a, b are ptr expressions.
           Then,

               Sb($\psi_S$) = nsrtb $x_a$ below $x_b$ in S2b($\psi_S$)      (2.13)

We postpone (to Chapter 3) an accurate description of these inverse functions
exchb ... and nsrtb ..., as their evaluation plays an important
role in our theorem proving. Intuitively, these functions reproduce the
situation(s) which must have existed prior to the exchange or insert.

Rule B2 (applicable iff S1 is an if-statement)

    Let S1 =  if B1 then
                  S3
              else
                  S4
              endif

    Then

        Sb($\psi_S$) = S3b(S2b($\psi_S$)) and B1
                       or S4b(S2b($\psi_S$)) and not B1
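
Rule B1.1 is easy to mechanize for ptr predicates, because substituting a ptr expression v + k for a ptr variable u in a predicate of the form left + c ≤ right yields a predicate of the same form after the constants are collected. The Python sketch below illustrates this; the class and function names are hypothetical and the representation is an assumption made for the example, not the data structure of our verifier.

```python
from typing import NamedTuple

class PtrPred(NamedTuple):
    """The ptr predicate  left + c <= right  over ptr variables left and right."""
    left: str
    c: int
    right: str

def subst(pred, u, var, k):
    """subst (var + k) for u in pred, renormalized to the form left + c <= right."""
    left, c, right = pred
    if pred.left == u:
        left, c = var, c + k      # (var + k) + c <= right
    if pred.right == u:
        right, c = var, c - k     # left + c <= var + k
    return PtrPred(left, c, right)

def holds(pred, env):
    """Evaluate a ptr predicate in an environment mapping ptr variables to integers."""
    return env[pred.left] + pred.c <= env[pred.right]
```

For the statement j ← j + 1 and the exit assertion j ≤ i, written PtrPred("j", 0, "i"), the call subst(PtrPred("j", 0, "i"), "j", "j", 1) gives PtrPred("j", 1, "i"), that is j + 1 ≤ i, which holds before the assignment exactly when j ≤ i holds after it.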


S1 is a call statement

Let the call statement and called procedure be as shown below:

    * $\alpha$
    call Q(s,t) : (u,v)
    * $\beta$

    procedure Q(a,b) : (c,d)
    * $\phi_Q$
        [procedure body]
    * $\psi_Q$
    endproc

where a and b are the two distinct input variables of procedure Q receiving
values from the ptr expressions s and t respectively; the lists
given after ":" are the output parameters; $\alpha$ is the given entry assertion
to the call statement, and $\beta$ is the generated exit assertion of the call,
obtained by looking at the statements below the call; $\phi_Q$, $\psi_Q$ are the
given entry and exit assertions for the procedure Q.

We should prove the following two lemmas:

    $\alpha$  logically implies  $\phi'_Q$                                             (2.16)

    unmodified parts of $\alpha$ wrt (s,t) and $\psi'_Q$  logically implies  $\beta$   (2.17)

where $\phi'_Q$, $\psi'_Q$, and unmodified parts of $\alpha$ are obtained from $\phi_Q$, $\psi_Q$, and $\alpha$ as
described below.

The entry assertion $\phi_Q$ should not have any ptr variables other
than a or b because these are not defined on entry. We substitute in
$\phi_Q$ the expressions s and t for a and b, resulting in $\phi'_Q$.

The exit assertion $\psi_Q$ should not have ptr variables other than
a and b or those contained in ptr expressions c and d. We substitute
in $\psi_Q$ s, t, u and v respectively for a, b, c and d to obtain $\psi'_Q$. Note
that the ptr variables u and v are substituted for expressions c and d.
Also note that if c or d contains either a or b then the ptr variables u
or v will be equal to an expression involving s or t. The substitution
of s and t for a and b is valid because the variables a and b are not
subject to assignment in procedure Q.

Recall that procedure Q can permute elements belonging to the
segment array (a,b). Thus, those predicates of $\alpha$ not including segments
which are strict subsegments of array (s,t) will still be true upon exit
from Q. Such predicates are collected together in unmodified parts of $\alpha$
wrt (s,t). We postpone the description of this function to Section 3.3.2.

When the program consists of more than one procedure, we must
prove the lemmas (2.16) and (2.17) for each procedure call, and further
prove that the called procedures meet their specifications.

To generate the entry assertion for the body of a given procedure,
we begin at the bottommost and innermost loop and successively
generate the entry assertions of loops as described above. If $\phi_P$ is the
generated entry assertion of the procedure body P, and $\phi$ is the given
entry assertion, we should prove, in addition to the lemmas generated
for each loop as in (2.9) and (2.10), the lemma

    $\phi$  logically implies  $\phi_P$                               (2.15)

Clearly, the proof of all these lemmas guarantees that

    $\{\phi \mid P \mid \psi\}$                                       (2.1)

An example of lemma generation appears in Figures 2.2 and 2.3.
 1   procedure sort (n)
 *     TRUE
 2     i ← n
 3     while i ≥ 2 do
 4       j ← 1
 5       while j ≤ i-1 do
 6         if xj > xj+1 then
 7           exchange xj with xj+1
 8         else
 9         endif
 *       1 ≤ J ≤ I ≤ N & A(1;J) ≤ XJ+1 & A(1;I) ≤ S(I+1;N)
10       endwhile
 *     1 ≤ I ≤ N & A(1;I-1) ≤ S(I;N)
11     endwhile
 *   S(1;N)
12   endproc



The assertions 10* and 12* of this program are related to 7* and 8*
of Figure 2.1 as follows:

    10* = subst j-1 for j in 7*

    12* = subst i+1 for i in 8*

Figure 2.2  Bubble Sort Program of Figure 2.1 Rewritten Using while-
Statements Instead of scans.
The verification conditions for the program in Figure 2.2 are:

- for loop at 5:

    10* and j < i  logically implies  stmts[6...10]b(10*)           (V1)

    10* and j ≥ i  logically implies  subst i-1 for i in 12*        (V2)

  where

    stmts[6...10]b(10*) ≡ the generated entry assertion of body of loop 5
                        = x_j ≤ x_j+1 and 9*
                          or x_j > x_j+1 and exchb x_j with x_j+1 in 9*

  where

    9* ≡ subst j+1 for j in 10*

- for loop at 3:

    12* and i > 1  logically implies  stmts[4...12]b(12*)           (V3)

    12* and i ≤ 1  logically implies  sorted(1,n)                   (V4)

  where

    stmts[4...12]b(12*) ≡ subst 1 for j in
                            (subst i-1 for i in 12*) and j ≥ i
                            or stmts[6...10]b(10*) and j < i

- and for proc body:

    true  logically implies  subst n for i in                       (V5)
                               sorted(1,n) and i ≤ 1
                               or stmts[4...12]b(12*) and i > 1


Figure 2.3  The Five Verification Conditions Generated for Bubble Sort
3.  THEOREM PROVER

In this chapter, we describe a procedure for proving or disproving a
theorem whose premise and conclusion are augmented well-formed formulas.
Well-formed formulas (wffs) are the sentences of the assertion language
(see Chapter 5). Augmented wffs involve the functions subst, exchb and
nsrtb which map a pair consisting of a wff and a programming language
statement into a wff. The details of these mappings will be given in
Section 3.3. The proving or disproving of a theorem is done in two phases:
in the first phase, the augmented wffs are converted into wffs; in the
second phase, the actual proof begins. We discuss the second phase first,
as the evaluation of the above functions involves the concept of
partitioning, which is an important part of the second phase.

The basic theorem prover is described in Section 3.1. A proof that this
basic theorem prover is a decision procedure for theorems stated as wffs
is given in Section 3.2. The theorem prover is then extended (in Section
3.3) to include evaluation of the aforementioned functions. The chapter
concludes with Section 3.5 where the generation of counterexamples is
discussed.

The treatment will be informal. The level of rigor in the proofs is
comparable to that generally found when discussing combinatorial
algorithms. Several "remarks" are made soon after describing a procedure.
These are meta-lemmas describing the properties of the verification
system. We omit the proofs of these remarks as they are neither
illuminating nor interesting.
3.1  Basic Theorem Prover

A theorem prover attempts to prove that a given conclusion ω follows
from a certain hypothesis or premise Ω. If ω does not follow from Ω, a
general theorem prover may not always halt and say so. However, if Ω and
ω are sentences of a properly chosen assertion language, it is possible,
as we demonstrate in this chapter for our assertion language, to give a
decision procedure for the question: Does Ω logically imply ω? We
construct a "most general model" for Ω, and then determine if ω is "true"
in this model. If ω is indeed satisfied by this model, then Ω logically
implies ω; otherwise, we will be able to produce a counterexample which
gives specific values to the variables making ω false and Ω true. To make
the discussion more precise, we will need the following definitions.

3.1.1  Definition 


Def 1  An interpretation of a wff is a mapping of the set of all ptrs,
       constants and the elements of the array into the set of integers.
       The ptr constants 0, 1, 2, ..., the function symbols +, -, and the
       relation symbols <, ≤, >, ≥, =, ≠ are given the usual meaning.
       The key function x maps the ptr expressions into key elements.

We have taken the set of integers as the universe for the keys of the array
only for the sake of simplicity in the ensuing discussion. However, any
set of keys on which a linear ordering is defined will do for this universe.
The domain of ptr values can similarly be enlarged.
Def 2  (Truth of Predicates):

       The ptr predicates are interpreted as relations on integers
       in the conventional way.

       The array predicate array (s,t) ≤ array (u,v) is true if either of
       the arrays is empty (i.e., s > t or u > v), or if no element of
       array (s,t) is greater than any element of array (u,v). (Similar
       meanings are given to <, >, and ≥ relations between array segments.)

       The array predicate sorted (s,t) is true either if the array
       segment array (s,t) is empty, or if the elements of the segment are
       arranged in nondecreasing order from the lower boundary s to the
       upper boundary t of the segment.

A disjunct is of the form p_1 and p_2 and ... where each p_i is a predicate.
A wff is of the form d_1 or d_2 or ... where each d_i is a disjunct. The
logical connectives and and or are interpreted in the conventional way.
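
The truth conditions of Def 2 can be made concrete with a small evaluator.
The Python sketch below is illustrative only: the function names, the
dictionary used for the array, and the encoding of a wff as a list of
disjuncts (each a list of predicate functions) are assumptions made here
and are not taken from SORTLAB.

```python
def segment(keys, s, t):
    """Keys x_s .. x_t (inclusive); the segment is empty when s > t."""
    return [keys[p] for p in range(s, t + 1)]


def array_le(keys, s, t, u, v):
    """array(s,t) <= array(u,v): true if either segment is empty, or if
    no element of the first is greater than any element of the second."""
    a, b = segment(keys, s, t), segment(keys, u, v)
    return not a or not b or max(a) <= min(b)


def is_sorted(keys, s, t):
    """sorted(s,t): true if the segment is empty or nondecreasing."""
    seg = segment(keys, s, t)
    return all(x <= y for x, y in zip(seg, seg[1:]))


def holds(wff, interp):
    """A wff is a list of disjuncts joined by `or`; a disjunct is a list
    of predicates joined by `and`; a predicate maps an interpretation
    to a truth value."""
    return any(all(p(interp) for p in disjunct) for disjunct in wff)


# Hypothetical interpretation: values for ptrs i, n and keys x_1 .. x_4.
interp = {"i": 2, "n": 4, "keys": {1: 3, 2: 1, 3: 4, 4: 5}}

# omega = array(1,i) <= array(i+1,n)  or  x_i > x_i+1
omega = [
    [lambda m: array_le(m["keys"], 1, m["i"], m["i"] + 1, m["n"])],
    [lambda m: m["keys"][m["i"]] > m["keys"][m["i"] + 1]],
]
result = holds(omega, interp)
```

Under this interpretation the first disjunct already holds, so the
interpretation is a model of the example wff.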

Def 3  An interpretation M is said to satisfy a wff φ, notationally ⊨_M φ,
       if φ is true in M. The interpretation M is then a model for φ.

Def 4  A wff Ω logically implies a wff ω, notationally Ω ⊨ ω, if ω is
       satisfied by every model of Ω.

Def 5  A wff φ_1 is equivalent to φ_2, notationally φ_1 ⊨⫤ φ_2, if
       φ_1 ⊨ φ_2 and φ_2 ⊨ φ_1.

3.1.2  Outline  of  Theorem  Prover 


The general strategy of the proof procedure is as follows: The
wffs Ω and ω are in disjunctive normal form. If Ω = Ω_1 or Ω_2 or ...
then the proof of Ω ⊨ ω is a collection of proofs of Ω_i ⊨ ω.

    Ω ⊨ ω
        Ω_1 ⊨ ω   and   Ω_2 ⊨ ω   and   ...   Ω_i ⊨ ω   ...

Given a disjunct Ω_1, and the conclusion ω to be made from it, we construct
Ω_1* from Ω_1 using certain "inference rules." (For our purposes, an
inference rule is a procedure which transforms a given wff φ, according to
certain criteria, into a wff φ* which is in a more convenient form than φ.)
The wff Ω_1* represents all that can be "deduced" out of the facts given by
Ω_1, and is equivalent to Ω_1. However, Ω_1* is not necessarily a single
disjunct. Thus for each disjunct Ω_1i* of Ω_1*, we "normalize" Ω_1i* and ω
so that both wffs use not only the same set of ptrs but also use the same
set of array segments. This may require "partitioning" the array segments
they originally referred to into smaller segments. The wff ω is rewritten
as ω^#i using these smaller segments. Thus,

    Ω_1 ⊨ ω
        Ω_1* ⊨ ω
            Ω_11* ⊨ ω^#1   and   Ω_12* ⊨ ω^#2   and   ...

As we shall see later, ω^#i is equivalent to ω in the context of Ω_1i*.
The Ω_1i*'s further have the property that if Ω_11* is satisfiable then

    Ω_11* ⊨ p   iff   P ⊨ p                                      (3.1)

where P is a particular predicate of Ω_11* that "corresponds" to the
predicate p of ω^#1. Thus, if ω^#1 is a disjunct,

    Ω_11* ⊨ ω^#1
        Ω_11* ⊨ p_11   and   Ω_11* ⊨ p_12   and   ...
            P_11 ⊨ p_11   and   P_12 ⊨ p_12   and   ...

where ω^#1 = p_11 and p_12 and ...

If ω^#1 is not a single disjunct, let it be ω_1^#1 or ω_2^#1 where ω_1^#1
is a single disjunct. Clearly, if Ω_11* ⊨ ω_1^#1 then Ω_11* ⊨ ω^#1.
Otherwise, let p_1i be a predicate of ω_1^#1 which is not implied by
Ω_11*: Ω_11* ⊭ p_1i. We then consider two cases of the premise:

    Ω_11* ⊨ ω^#1
        Ω_11* and p_1i ⊨ ω_1^#1 or ω_2^#1   and   Ω_11* and not p_1i ⊨ ω_2^#1

We now take the transitive closure of the new premises Ω_11* and p_1i to
insure that property (3.1) mentioned above holds, and repeat the whole
process for each of the new premises.

An example of Ω, ω and Ω* and ω^# is given in Figure 3.1.

3.1.3  Inference Rules

We now describe a procedure for generating the aforementioned Ω_1i*
and ω^#i from Ω and ω. We will assume that Ω and ω are disjuncts. The more
Ω ≡ 2 < i + 1 < n and sorted(1,i) and sorted(i+1,n)

ω ≡ array(1,i) ≤ array(i+1,n) or x_i > x_i+1

array(1,n) is partitioned into

    array(1,i-1); array(i,i); array(i+1,i+1); array(i+2,n)

Ω* ≡ 2 ≤ i and i + 2 ≤ n and 4 ≤ n and
     sorted(1,i-1) ≤ x_i and x_i+1 ≤ sorted(i+2,n)

ω^#1 ≡ ω_1^#1 or ω_2^#1 where

    ω_1^#1 ≡ array(1,i-1) ≤ array(i+2,n) and x_i ≤ array(i+2,n) and
             array(1,i-1) ≤ x_i+1 and x_i ≤ x_i+1

    ω_2^#1 ≡ x_i > x_i+1

The predicate x_i ≤ x_i+1 of ω_1^#1 is not implied by Ω*.

The two new premises are:

    Ω* and x_i ≤ x_i+1                Ω* and x_i > x_i+1

The transitive closure of Ω* and x_i ≤ x_i+1 is:

    2 ≤ i and i + 2 ≤ n and 4 ≤ n and
    sorted(1,i-1) ≤ x_i and x_i+1 ≤ sorted(i+2,n) and
    sorted(1,i-1) ≤ x_i+1 and x_i ≤ sorted(i+2,n) and
    sorted(1,i-1) ≤ sorted(i+2,n) and x_i ≤ x_i+1

The predicate sorted(1,i-1) ≤ sorted(i+2,n) of above "corresponds" to
the predicate array(1,i-1) ≤ array(i+2,n) of ω_1^#1.


Figure 3.1  An Example of Ω, ω, Ω* and ω^#
general case will be dealt with later (see Algorithms 1, 2, 3 and 4 of
Section 3.2). The procedure consists of several subprocedures (inference
rules) each performing a distinct transformation on a subset of
predicates of Ω and ω. We will find the following descriptions of the
effect of applying the inference rules useful. We are interested only
in "sound" inference rules:


Def 6  An inference rule r is sound if it yields φ* when applied on
       φ, notationally φ ⊢_r φ*, such that φ ⊨ φ*.

Def 7  An inference rule r is information preserving iff
       (φ ⊢_r φ* implies φ* ⊨ φ).

Intuitively, no information carried by φ has been lost in the process of
applying an information preserving rule.

Not all inference rules are information preserving. For example,
our rule of local implication (Section 3.1.3) lets us conclude that
(u + k_2 ≤ v) from (u + k_1 ≤ v) whenever k_2 ≤ k_1. Clearly, this rule is
not information preserving.


Def 8  An inference rule r is an enriching rule if it yields φ* when
       applied on φ such that

       1. r is information preserving, and

       2. For every aR*b of φ*, consider the predicates aRb of φ
          on the same variables a and b. Then aR*b ⊨ aRb but not
          necessarily aRb ⊨ aR*b.
Note that an enriching rule does not actually create new information,
but rather makes whatever information was present more readily usable.
All our inference rules, except the abovementioned rule of local
implication, are enriching rules.

It will be convenient to describe the rules on a directed
graph representation of the wffs. The representation of a wff is the
collection of the representations of its disjuncts. Each disjunct φ is
represented as a pair of graphs: a ptr graph representing the conjunction
of ptr predicates in the wff φ, and a key graph representing
the array predicates. There is a coupling between these two graphs,
namely, the boundaries of the array segments of the key graph are defined
by the pointers.

The construction of the ptr graph π of a disjunct φ is described
below. A partitioned array (key) graph will be constructed later.


3.1.3.1  Graph  Construction 

The ptr graph π will have a vertex for each ptr variable referred
to in φ. For each ptr predicate (u + k_1 ≤ v) in φ, we put a
directed edge from u to v and label it "k_1." Note that k_1 is a signed
integer, and the relation is always ≤. The graph π may have more than one
edge from a vertex u to a vertex v. An example of a disjunct φ and its
pointer graph appears in Figure 3.2. In constructing the ptr graph
the following equality axiom is embedded: for any ptr u, u + 0 ≤ u.

3.1.3.2  Subsumption  for  Ptrs 


The  graph,  as  constructed  in  the  preceding  section,  may  have 


φ ≡ 1 ≤ j ≤ i ≤ n and array(1,j-1) ≤ x_j and
    array(1,i) ≤ sorted(i+1,n) and j < i
(φ is the premise of the verification condition V1 of
Figure 2.3).


The ptr graph π of φ is

Figure 3.2  An Example of a Disjunct and Its ptr Graph

more than one directed edge from a vertex u to a vertex v. The rule of
subsumption replaces all edges from vertex u to v by a single edge. The
label on this single edge is a label on one of the previous edges (see
Figure 3.3).

    Let {k_1, k_2, ..., k_i} be the set of labels (integer constants)
    on edges from u to v. Then we delete all these i edges, and replace
    them by a single edge with the label k_max = max {k_1, k_2, ..., k_i}.

Remark 1:  The rule of subsumption for ptrs is an enriching rule.

3.1.3.3  Transitivity Rule for Ptrs


    For any pair of edges (u + k_1 ≤ v) and (v + k_2 ≤ w) of the
    ptr graph, add a new edge (u + k_1 + k_2 ≤ w).

By applying this rule to a ptr graph as long as it yields new edges, we
construct the transitive closure of this ptr graph.

Remark 2:  The transitivity rule is an enriching rule.
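
The graph construction, subsumption and transitivity rules for ptrs can
be sketched together in Python. This is a minimal illustration under
stated assumptions, not the SORTLAB implementation: the names
build_ptr_graph and transitive_closure are hypothetical, ptr constants are
expressed relative to a vertex "0", and the closure is computed by one
Floyd-Warshall pass over (max, +) rather than by repeated rule
application. The satisfiability test assumes the usual criterion for
constraints of this form: a closed self-loop u + k ≤ u with k > 0 can hold
in no interpretation.

```python
from itertools import product

NEG_INF = float("-inf")


def build_ptr_graph(vertices, predicates):
    """Each predicate (u, k, v) stands for u + k <= v.  Parallel edges are
    merged at once by subsumption: only the largest label is kept."""
    graph = {(u, u): 0 for u in vertices}        # embedded axiom u + 0 <= u
    for u, k, v in predicates:
        graph[(u, v)] = max(k, graph.get((u, v), NEG_INF))
    return graph


def transitive_closure(vertices, graph):
    """Return (closure, satisfiable).  From u + k1 <= m and m + k2 <= v
    derive u + k1 + k2 <= v, keeping the largest label per vertex pair."""
    closure = dict(graph)
    for m, u, v in product(vertices, repeat=3):  # m is the outermost loop
        if (u, m) in closure and (m, v) in closure:
            k = closure[(u, m)] + closure[(m, v)]
            if k > closure.get((u, v), NEG_INF):
                closure[(u, v)] = k
    satisfiable = all(closure[(u, u)] <= 0 for u in vertices)
    return closure, satisfiable


# Hypothetical reading of the ptr predicates 1 <= j <= i <= n and j < i,
# with j < i written as j + 1 <= i.
vertices = ["0", "j", "i", "n"]
premise = [("0", 1, "j"), ("j", 0, "i"), ("i", 0, "n"), ("j", 1, "i")]
graph = build_ptr_graph(vertices, premise)
closure, ok = transitive_closure(vertices, graph)
```

In this example the two edges from j to i collapse to the single label 1,
and the closure contains the derived edge 0 + 2 ≤ n.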

3.1.3.4  Partitioning  of  the  Array 


We now come to a simple, but powerful, idea of the theorem
prover: express both premise and conclusion in terms of a common set of
array segments. To do this, we partition the array (1,n) into
nonoverlapping segments so that an array segment referred to in either of
the wffs Ω or ω is the union of a few contiguous segments in the partition.
Using these partitioned segments, equivalent wffs Ω* and ω^# are obtained.


Figure 3.3  Ptr Graph of Figure 3.2 after Subsumption

Figure 3.4  Transitive Closure π* of the Ptr Graph π of Figure 3.3

Consider an array segment array (s,t). It partitions the
entire array into three segments: x_-∞ ... x_s-1; x_s ... x_t;
x_t+1 ... x_+∞, some of which may be empty. If we overlay one partition
on another, we obtain their product. The array partition needed to express
Ω and ω in terms of a common set of segments is obtained as the product of
all individual partitions defined by the array segments occurring in
either Ω or ω. Each linear ordering of the boundaries used in Ω or ω
defines a partition of the array.

The partitioning procedure collects the relevant boundaries,
and produces all linear orderings of these boundaries in the context of
the partial ordering specified by the ptr graph π of Ω.

The set B of boundaries is constructed as follows:
    Initially, B ← {-∞, +∞}

    for each array segment array (s,t) referred to
        either in Ω or in ω do
        B ← B ∪ {s-1, s, t, t+1}
    endfor
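
The construction of B translates almost line for line into Python. In the
sketch below a ptr expression is encoded as a pair (variable, offset),
with constants written relative to a variable named "0"; this encoding and
the helper names are assumptions chosen for illustration only. The
segments are those of the wffs Ω and ω used in the example of Figure 3.5.

```python
NEG_INF = ("-inf", 0)
POS_INF = ("+inf", 0)


def boundaries(segments):
    """Collect s-1, s, t, t+1 for every segment array(s,t), starting
    from the two infinite boundaries."""
    b = {NEG_INF, POS_INF}
    for (sv, sk), (tv, tk) in segments:
        b |= {(sv, sk - 1), (sv, sk), (tv, tk), (tv, tk + 1)}
    return b


def show(ptr):
    """Render (variable, offset) as text, e.g. ("i", -1) -> "i-1"."""
    var, k = ptr
    if var == "0":
        return str(k)
    if k == 0 or var in ("-inf", "+inf"):
        return var
    return f"{var}{k:+d}"


segments = [
    (("0", 1), ("i", -1)),   # array(1,i-1)   from omega
    (("i", 0), ("n", 0)),    # array(i,n)     from omega
    (("0", 1), ("j", -1)),   # array(1,j-1)   from Omega
    (("j", 0), ("j", 0)),    # array(j,j)     from Omega
    (("0", 1), ("i", 0)),    # array(1,i)     from Omega
    (("i", 1), ("n", 0)),    # array(i+1,n)   from Omega
]
B = boundaries(segments)
names = {show(p) for p in B}
```

Because B is a set, boundaries contributed by several segments (such as 0
and 1 here) are recorded only once.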
Consider a maximal chain of the following kind, in the context of the
given ptr graph π of Ω:

    C :  -∞ = b_0  b_1  b_2  ...  b_2q  b_2q+1 = +∞

where for 1 ≤ i ≤ q

    b_2i and b_2i+1 are boundaries in B, and

    b_2i = 1 + b_2i-1 and

    b_2i ≤ b_2i+1

as implied by the ptr graph π of Ω.
If every boundary b of B either appears in C, or is equal to a boundary
appearing in C, then a composition (product) of the partitions is
readily obtained:

    b_0 to b_1;  b_2 to b_3;  ...;  b_2q to b_2q+1.

However, if for some boundary pair b and 1 + b of B at least one of them
is not in C, then we resort to case analysis. We find the largest j such
that

    b_j ≤ b

Clearly, it is not known if b_j+1 ≤ b or b_j+1 > b. For otherwise, we
either have a larger j, or a longer chain. The wff Ω is equivalent to
the disjunction of Ω_1, Ω_2 where

    Ω_1 ≡ Ω and (b < b_j+1)
    Ω_2 ≡ Ω and (b ≥ b_j+1)

Proving Ω ⊨ ω is equivalent to proving

    Ω_1 ⊨ ω   and   Ω_2 ⊨ ω,

and in each of Ω_1 and Ω_2 we can produce longer chains of boundaries
than in Ω. We apply the same procedure of obtaining maximal chains of
boundaries on each of Ω_1 ⊨ ω and Ω_2 ⊨ ω. Clearly, this process will
terminate. Let Ω_1, Ω_2, Ω_3, ..., Ω_c be the decompositions thus
obtained; in each of Ω_i, the boundaries can all be put into one chain.
(See Figure 3.5). The following lemma immediately follows:


Lemma 1  Let Ω_C = Ω_1 or Ω_2 or ... or Ω_c where the Ω_i's are the result
of decompositions of Ω made while ordering the boundaries. Then

    Ω ⊨⫤ Ω_C

Note that the disjuncts Ω_1, Ω_2, ..., Ω_c differ only in the ptr
predicates; they all have the same set of array predicates. The ptr graph
of Ω defines a partial ordering on the set of boundaries B. From this
partial ordering, say L, we are obtaining all linear orders L_1, L_2, ...,
L_c. Hence

    L ⊨⫤ L_1 or L_2 or ... or L_c.

If Ω = S_π and S_α where S_π and S_α are respectively the set of ptr, and
array predicates of the disjunct Ω, then

    Ω_i = L_i and S_α.

Lemma 2  For i ≠ j, (Ω and Ω_i and Ω_j) is unsatisfiable.
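
The step from a partial ordering L to its linear orders L_1, ..., L_c can
be illustrated by enumerating the linear extensions of a strict partial
order. The Python sketch below is a deliberately simplified stand-in for
the chain-lengthening case analysis described above: the function name is
hypothetical, boundaries known to be equal are assumed to have been merged
beforehand, and every permutation is tested, which is only practical for
the small boundary sets of an example.

```python
from itertools import permutations


def linear_orderings(elements, precedes):
    """Yield every arrangement of `elements` that respects each pair
    (a, b) in `precedes`, read as "a comes before b"."""
    for perm in permutations(elements):
        position = {e: k for k, e in enumerate(perm)}
        if all(position[a] < position[b] for a, b in precedes):
            yield perm


# Hypothetical fragment: i precedes both i+1 and n, while the relative
# order of i+1 and n is left open by the partial ordering.
orders = list(linear_orderings(["i", "i+1", "n"], {("i", "i+1"), ("i", "n")}))
```

Any two distinct orderings produced this way disagree on the relative
position of at least one pair of boundaries, which is the intuition behind
Lemma 2.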

Construction  of  Key  Graph 

We construct the key graph α of a disjunct φ in the context of
a given partition of the array defined by a linear ordering L of
boundaries. For each segment array (s,t), there are two vertices: minx
(s,t) representing the minimum key, and maxx (s,t) representing the
maximum key of
Let ω ≡ 1 ≤ i - 1 ≤ n and array(1,i-1) ≤ sorted(i,n)
and Ω ≡ 1 ≤ j ≤ i ≤ n and S_α and j ≥ i where

    S_α ≡ array(1,j-1) ≤ array(j,j) and array(1,i) ≤ sorted(i+1,n)

(The verification condition V2 in Figure 2.3 is Ω ⊨ ω.)

The set of boundaries B = {-∞, +∞, 0, 1, i-1, i, n, n+1, j-1, j, j+1, i+1}

Maximal chain C induced by the ptr graph π ≡ 1 ≤ j = i ≤ n of Ω is

    -∞  0  1  i-1  i  n  n+1  +∞

There are 2 linear orderings of boundaries in B:
Ω ⊨⫤ Ω_1 or Ω_2, where

    Ω_1 ≡ 1 ≤ j = i < n and S_α

    C_1 :  -∞  0  1  i-1  i  i+1  n  n+1  +∞

    partition :  array(-∞,0);  array(1,i-1);  array(i,i);
                 array(i+1,n);  array(n+1,+∞)

    Ω_2 ≡ 1 ≤ j = i = n and S_α

    C_2 :  -∞  0  1  i-1  i  i+1  +∞

    partition :  array(-∞,0);  array(1,i-1);
                 array(i,i);  array(i+1,+∞)


Figure 3.5  An Example of Ordering Boundaries and Partitioning the Array

the segment. The edges in the key graph are labeled using the subarray
relationships induced by the array predicates S_α of the disjunct φ. The
following axioms are embedded in the construction of this key graph α
from the set S_α of unpartitioned array predicates.

-  If array (s,t) is nonempty, then minx (s,t) ≤ maxx (s,t)

-  A subsegment of a sorted array segment is sorted

-  If array (s,t) and array (u,v) are subsegments of a sorted
   segment and if t < u then array (s,t) ≤ array (u,v)

-  If s = t then array (s,t) ≤ array (s,t)

-  If two array segments are related by R, then their respective
   subsegments are also R-related.

An array segment array (s1,t1) is a subsegment of array (s,t) if s ≤
s1 ≤ t1 ≤ t. The sorted (s,t) predicate is represented as a pseudo-binary
relation: array (s,t) sorted array (s,t). These axioms
are used to put labeled edges between vertices of the key graph: for
each array predicate (array (s,t) R array (u,v)), we put an edge from
maxx (s,t) to minx (u,v) and label it with R (see Figure 3.6). We do
not construct a key graph in the absence of a partition. Thus, when
considering a key graph, a "context" is always present.


Remark 3:  α and L ⊨⫤ S_α and L, where L is the linear ordering of
boundaries which defined the present partition.

Remark 4:  The negation of the predicate sorted (s,t) is not
representable. In Section 3.2.2, we prove (Theorem 4) that representing
the negation of sorted (s,t) will not be necessary.
Figure 3.6  Key Graph for S_α in the Context of the Partition C_1 of
Figure 3.5


3.1.3.5  Rule of Subsumption for Array Predicates

The rule of subsumption replaces all edges from a vertex v_1 to
v_2 by a single edge. The label on this single edge is a label on one of
the previous edges.

    Let {k_1, k_2, ..., k_i} be the set of labels (integer constants
    and sorted) on edges from v_1 to v_2. Then we delete all these
    i edges and replace them by a single edge with the label k_max =
    max {k_1, k_2, ..., k_i}. For the purpose of this rule we
    define the label sorted to be less than any other label.

The rule is identical to the rule of subsumption for pointer predicates.
Note that maxx (s,t) ≤ minx (s,t) implies sorted (s,t) since all the
elements of array (s,t) are then equal, while the converse is not true. For
this reason, we defined the label sorted to be the smallest label (see
Figure 3.7).

Remark  5:  The  rule  of  subsumption  for  array  predicates  is  an  enriching 
rule. 


(a)  A Key Graph α

(b)  Transitive Closure α* of Key Graph α (self-loops not shown)


Figure 3.7  A Key Graph and Its Transitive Closure


3.1.3.6     Transitivity  Rule  for  Array  Segments 


The transitive closure of key graph α is obtained in a way
similar to that of ptr graph π.

    If (v_1 + k_1 ≤ v_2) and (v_2 + k_2 ≤ v_3)
    then (v_1 + k_1 + k_2 ≤ v_3)

Recall that the labels k can be either integer constants, or sorted.
For the purpose of this rule, we define the label sorted to satisfy:

    1.  sorted < any other label

    2.  sorted + any label = sorted = any label + sorted

Remark 6:  The transitivity rule for array segments is an enriching rule.
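
The two conventions for the label sorted are enough to reuse the
subsumption and transitivity rules on key graphs. A minimal Python sketch
of this label arithmetic follows; the string marker SORTED and the
function names are assumptions made for illustration.

```python
SORTED = "sorted"   # symbolic label, ordered below every integer label


def label_less(a, b):
    """Strict order on labels: sorted < any other label."""
    if a == SORTED:
        return b != SORTED
    if b == SORTED:
        return False
    return a < b


def label_max(labels):
    """Label kept by subsumption; sorted is returned only when no
    integer label is present."""
    best = SORTED
    for k in labels:
        if label_less(best, k):
            best = k
    return best


def label_add(a, b):
    """Label produced by transitivity:
    sorted + any label = sorted = any label + sorted."""
    if a == SORTED or b == SORTED:
        return SORTED
    return a + b


merged = label_max([SORTED, 0, -2])   # parallel edges between two vertices
chained = label_add(0, SORTED)        # path through a sorted edge
```

With these definitions an integer label always wins under subsumption,
while any path that passes through a sorted edge yields only the label
sorted.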

3.1.3.7     Rule  of  Local    Implication 

This rule lets us conclude a global property like φ ⊨ p, where
p is a ptr or key predicate, and φ is in a certain form, from a local
property that P ⊨ p, the P being a particular predicate of φ.

Def 11  A disjunct φ* = π* and α* is an enriched disjunct with respect
        to a set B of boundaries if

        1.  The ptr graph π* of φ* defines exactly one linear ordering
            L of the boundaries of B

        2.  The array predicates of α* have been expressed using the
            segments defined by the partition induced by the linear
            ordering L of the boundaries

        3.  Both π* and α* are transitively closed

Let φ_1* = π_1* and α_1* be an enriched disjunct. Let φ_2 = π_2 and
α_2 be a disjunct such that the vertex-set of π_2 is the same as that of
π_1* and the vertex-set of α_2 is the same as that of α_1*.

Def 12  For each predicate (u R_2 v) of φ_2, the corresponding predicate
        in φ_1* is defined as follows:

        1.  If there is an edge from u to v labeled with, say, R_1 in
            α_1*, the corresponding predicate is (u R_1 v).
Ω_1* = π* and α*, where α* is the transitive closure of Figure 3.6.

The conclusion ω^#1, in the context of the partition defined by Ω_1*,
immediately follows from Ω_1* by the rule of local implication.


Figure 3.8  An Enriched Disjunct Ω_1* of Ω of Figure 3.5

        2.  If there is no edge from u to v in α_1*, then the
            corresponding predicate is (u null v), an empty predicate,
            which is defined to be true in all interpretations.
            (Intuitively, take null as "-∞ ≤.")

Consider a disjunct ω_1 of the conclusion ω to be made from an enriched
disjunct Ω*. Let ω_1^# be the rewritten version of ω_1 using the
partitioned array segments. Since ω_1^# is equivalent to ω_1 in this
context, it follows that

    Ω* ⊨ ω_1   iff   Ω* ⊨ ω_1^#
               iff   Ω* ⊨ p for every predicate p of ω_1^#.

Because Ω* is an enriched disjunct, we can make the following stronger
statement:

    Ω* ⊨ ω_1^#   iff   for any predicate p of ω_1^#, the corresponding
                       predicate P of Ω* is such that P ⊨ p       (3.2)

We shall refer to (3.2) as the rule of local implication. A proof of
the validity of this rule is given in Section 3.2.2.
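
For ptr predicates the local test P ⊨ p reduces to a comparison of two
labels, as in the earlier example where (u + k_2 ≤ v) follows from
(u + k_1 ≤ v) whenever k_2 ≤ k_1. The Python sketch below covers only
that case; the dictionary representation of the closed ptr graph and the
function names are assumptions made for illustration.

```python
def corresponding_label(closure, u, v):
    """Label of the edge u -> v in the transitively closed graph, or
    None when there is no edge (the `null` predicate)."""
    return closure.get((u, v))


def locally_implied(closure, u, k, v):
    """Is (u + k <= v) implied by its corresponding predicate alone?
    A null predicate is true everywhere and so implies nothing."""
    k1 = corresponding_label(closure, u, v)
    return k1 is not None and k <= k1


# Hypothetical closed ptr graph; an entry (u, v): k stands for u + k <= v.
closure = {("0", "j"): 1, ("j", "i"): 1, ("0", "i"): 2}

weaker = locally_implied(closure, "0", 1, "i")     # 0 + 1 <= i
stronger = locally_implied(closure, "0", 3, "i")   # 0 + 3 <= i
missing = locally_implied(closure, "i", 0, "j")    # no edge from i to j
```

Only one edge is inspected per predicate; what rule (3.2) adds is the
claim that, for an enriched disjunct, this single inspection decides the
global implication.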

3.2     Basic  Theorem  Prover  is  a  Decision  Procedure 


An informal, but complete, description of the basic theorem
prover is contained in Algorithms 1, 2, 3 and 4 in the following pages.
This section proves that the theorem prover is a decision procedure for
Ω ⊨ ω, where both Ω and ω are wffs. The theorem prover will be extended
in Section 3.3 to prove Ω ⊨ ω where either or both of Ω and ω can be
augmented wffs.

That the basic theorem prover terminates follows immediately
by considering the "length" of the conclusion ω. In Algorithms 3 and 4,
we delete either a whole disjunct, or a predicate of a disjunct from ω
in every iteration. In the present section, we shall be occupied with
the proof that the basic theorem prover gives correct answers, that is,

    when the basic theorem prover terminates, the boolean
    variable istheorem is true iff it is indeed the case       (3.3)
    that the premise Ω logically implies the conclusion ω.

The core of the proof is that the rule of local implication is
valid. The structure of the proof follows the structure of the algorithms
closely. We show (1) that the satisfiability of a graph is decidable
and that it is obtained as a by-product of transitive closure, and (2)
that Ω* ⊨ p iff p follows from Ω* by local implication.

3.2.1  Model Construction

Given an enriched disjunct φ* = π* and α*, we want to construct
a model for it by assigning values to ptrs and keys.

Without loss of generality, we can assume that the vertex 0 is
present in the ptr graph π*, for, if it is not, introduce it by anding
π* with 0 ≤ 0. This vertex is then assigned the value zero. A model for
π* is constructed first, and then a model for α* is similarly constructed.
procedure basic theorem prover (Ω : wff ⊨ ω : wff)
    istheorem ← true
    if ω is empty then ω ← false endif
    for each disjunct Ω_1 of Ω and while istheorem do
        provetheorem (Ω_1 ⊨ ω)
    end for-while
endproc


Algorithm 1


procedure provetheorem (Ω_1 : disjunct ⊨ ω : wff)
    construct the ptr graph π of Ω_1; apply subsumption.
    if π is satisfiable
    then {see Algorithm 5 of Section 3.3.1}
        collect the boundaries referred to in Ω_1 and ω into a set B.
        for each linear ordering L, according to π, of
                boundaries of B
            and while istheorem do
            construct the key graph α of Ω_1 for this partition
                defined by L
            apply subsumption on α
            istheorem ← provelemma (π and L and α ⊨ ω)
        end for-while
    endif
endproc


Algorithm 2


function provelemma (Ω11 : disjunct ⊨ reference ω : wff)
returns value of local islemma : boolean
    π* ← transitive closure of ptr graph of Ω11
    σ* ← transitive closure of key graph of Ω11
    islemma ← (π* and σ* is unsatisfiable)
    if not (ω is empty or islemma) then
        choose a disjunct ω1 of ω; delete ω1 from ω
        ω1# ← equivalent of ω1 expressed using the partition defined
        islemma ← does (π* and σ*) imply (ω1# or ω)?
    endif
endfunction

Algorithm 3
function does (Ω11* : enriched disjunct) imply (ω1# : partitioned
               disjunct or reference ω : wff)?
returns value of local implies : boolean
    repeat
        choose a predicate p of ω1#; delete p from ω1#
    until ω1# is empty or Ω11* ⊭ p
    if Ω11* ⊨ p {local implication}
    then implies ← true
    else implies ← provelemma (Ω11* and p ⊨ ω1# or ω) and
                   provelemma (Ω11* and not p ⊨ ω)
                   {see Theorem 4 of Section 3.2.2}
    endif
endfunction

Algorithm 4
Model for π*

The model for π* is constructed iteratively. Let

    assigned = subset of vertices of π* to which values are
               already assigned, such that for u1, u2 ∈ assigned
               (u1 + k1 ≤ u2) of π* is satisfied, i.e.,
               valueof (u1) + k1 ≤ valueof (u2).

This model is then extended by choosing an arbitrary vertex v of π*,
which is not in assigned. If no such vertex exists, then the construction
of a model for π* is finished, and π* is satisfiable. Consider the
following set:

    S = predicates of π* to be satisfied by v
      = {(u_i + k_i ≤ v) | u_i ∈ assigned}
        ∪ {(v + k_j ≤ u_j) | u_j ∈ assigned}

The value to be assigned to v should be such that all the predicates in
S are indeed satisfied, and hence

    assigned ← assigned ∪ {v},

provided the label on the self-loop at v is zero.

Let V_max = max of valI and V_min = min of valJ, where

    valI = {valueof (u_i) + k_i | (u_i + k_i ≤ v) ∈ S} ∪ {-∞}
    valJ = {valueof (u_j) - k_j | (v + k_j ≤ u_j) ∈ S} ∪ {+∞}

If ptr v is assigned a value V such that

    V_max ≤ V ≤ V_min                                        (3.4)

then all the predicates in S are satisfied. However, if the self-loop
at v, (v + k ≤ v), is such that the label k > 0, this predicate is not
satisfiable in any interpretation, and the model construction cannot
proceed further.

Now, assuming that the self-loop at v is labeled with zero,
we show that a value V satisfying the inequality (3.4) must exist. Let
V_max be valueof (u1) + k1 (without loss of generality on the subscripts
i of u_i), and V_min be valueof (u2) - k2. Thus

    (u1 + k1 ≤ v) ∈ S and (v + k2 ≤ u2) ∈ S.

Since π* is transitively closed, (u1 + k ≤ u2) ∈ π* for some k greater
than or equal to k1 + k2, and since u1 and u2 are in assigned, the
predicate

    (u1 + k ≤ u2)

is satisfied by our hypothesis on the set assigned, that is,

    valueof (u1) + k ≤ valueof (u2)

and, therefore,

    V_max ≤ V_min.

The vertex v can, therefore, be assigned any finite value in the range
[V_max, V_min].
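
The iterative construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SORTLAB implementation: the names transitive_closure and build_model, the dictionary encoding of a ptr graph (a pair (u, v) mapped to a label k, read as u + k ≤ v), and the sample graph are all hypothetical. The closure is computed with a Floyd-Warshall style pass that keeps the largest label for each pair, which is one way (assumed here) to realize the rule of transitivity.

```python
import math

def transitive_closure(vertices, edges):
    """Close a ptr graph under transitivity.

    edges maps (u, v) to a label k, read as the predicate u + k <= v.
    When several derivations reach the same pair, the largest
    (strongest) label is kept.
    """
    closure = dict(edges)
    for w in vertices:
        for u in vertices:
            for v in vertices:
                if (u, w) in closure and (w, v) in closure:
                    k = closure[(u, w)] + closure[(w, v)]
                    if k > closure.get((u, v), -math.inf):
                        closure[(u, v)] = k
    return closure


def build_model(vertices, closure):
    """Return a dict vertex -> value satisfying every predicate, or None."""
    # A positively labeled self-loop (v + k <= v, k > 0) is unsatisfiable.
    if any(closure.get((v, v), 0) > 0 for v in vertices):
        return None
    value = {}  # plays the role of the set 'assigned'
    for v in vertices:
        v_max = max((value[u] + closure[(u, v)]
                     for u in value if (u, v) in closure), default=-math.inf)
        v_min = min((value[u] - closure[(v, u)]
                     for u in value if (v, u) in closure), default=math.inf)
        # Any finite value in [v_max, v_min] will do; pick one end.
        if v_max > -math.inf:
            value[v] = v_max
        elif v_min < math.inf:
            value[v] = v_min
        else:
            value[v] = 0
    return value


# Hypothetical ptr graph: i + 1 <= j and j <= n.
vertices = ["i", "j", "n"]
edges = {("i", "j"): 1, ("j", "n"): 0}
closure = transitive_closure(vertices, edges)
model = build_model(vertices, closure)
```

Adding the edge n ≤ i to the sample graph creates a directed cycle whose labels sum to 1, so the closure acquires a positive self-loop and build_model returns None.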
Model for σ*

The boundaries of array segments are defined by ptrs and hence
a model for σ* can be constructed only after a model for the ptrs is
given. If array (s,t) is a segment used in σ*, we have, in general,

    minx (s,t) ≤ maxx (s,t)

If s = t then this becomes an equality. We remark that the labels on
self-loops of σ* cannot be negative. A positively labeled self-loop is
clearly unsatisfiable. Thus, these labels can only be either sorted
or zero. For each pair of vertices minx (s,t), maxx (s,t) of σ*, we will
assign a single value. This assignment clearly satisfies all self-loops
and sorted predicates. Once this decision is made, the model construction
for σ* is identical to that for π*.

3.2.2 Unsatisfiability and Local Implication Theorem

Theorem 1 (Unsatisfiability Theorem)

    The ptr graph π is unsatisfiable iff the transitive closure
    π* of π has a self-loop whose edge label is positive.

Proof We know from Remarks 1 and 2 that

    π ⊨ π* and π* ⊨ π.

Thus, π is unsatisfiable iff π* is.

(⇐)  If π* has a self-loop at v with a positive label k, clearly
     (v + k ≤ v) is unsatisfiable in any interpretation.

(⇒)  It is well known that the transitive closure G* of a graph G
     will have a self-loop at a vertex v iff G has a directed cycle
     (not necessarily of length 1) passing through v. The rule of
     transitivity is such that the label on the self-loop at v in π*
     is not less than the sum of labels of edges in any directed
     cycle of π passing through v. Thus, if π has no directed
     cycle with positive edge-label sum passing through v, then π*
     does not have a self-loop at v with a positive label. For such
     a π*, we can indeed construct a model (see Section 3.2.1), and
     hence π is satisfiable. ∎

Corollary to Theorem 1 (Unsatisfiability of Key Graphs)

    The key graph σ, in the context of an enriched ptr disjunct,
    is unsatisfiable iff the transitive closure σ* of σ has a
    self-loop whose edge label is positive.

Theorem 2 (Local Implication Theorem)

    Let π* be a transitively closed ptr graph. Then π* ⊨
    (u + k2 ≤ v) iff the corresponding predicate (u + k1 ≤ v)
    in π* is such that k1 ≥ k2.

Proof

(⇐)  Obvious.

(⇒)  We prove that if k1 < k2 then π* ⊭ (u + k2 ≤ v). Assign u an
     arbitrary value, and then assign v a value equal to valueof (u)
     + k1. Set assigned ← {u,v}. Now, we can complete the construction
     of the model as in Section 3.2.1. Clearly, (u + k2 ≤ v)
     is false in this model. ∎
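
Theorem 2 reduces implication of a single predicate to a label comparison on the closed graph. The sketch below illustrates that comparison; the function name locally_implies, the dictionary encoding of π* (a pair (u, v) mapped to k1, read as u + k1 ≤ v) and the sample closure are assumptions made for illustration only.

```python
import math

def locally_implies(closure, u, k2, v):
    """True iff the closed graph contains (u + k1 <= v) with k1 >= k2.

    A missing edge is treated as the label -infinity, i.e. the graph
    says nothing about the pair (u, v).
    """
    k1 = closure.get((u, v), -math.inf)
    return k1 >= k2


# Hypothetical transitively closed ptr graph for i + 1 <= j and j <= n.
closure = {("i", "j"): 1, ("j", "n"): 0, ("i", "n"): 1}
```

Under these assumptions, locally_implies(closure, "i", 1, "n") holds because the closure carries the edge i + 1 ≤ n, while the stronger predicate i + 2 ≤ n is not implied.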


Corollary to Theorem 2

    Theorem 2 holds for a transitively closed key graph σ*, and
    array predicate (u + k2 ≤ v).


Theorem 3

    Let Ω* be an enriched disjunct, and ω# a disjunct partitioned
    with respect to the linear ordering of boundaries defined by
    Ω*. Then

        Ω* ⊨ ω#

    iff

        either, for every predicate p of ω#,
            Ω* locally implies p, or
        Ω* is unsatisfiable.

Proof by Theorem 1 and repeated application of Theorem 2. ∎

When an enriched disjunct π* and σ* does not imply a predicate
p of ω1#, we consider two cases (refer to Algorithm 4, if-statement):

    π* and σ* and not p ⊨ ω                                  (3.5)
    π* and σ* and p ⊨ (ω1# or ω)                             (3.6)

If p is sorted (s,t), not p cannot be represented in our scheme (Section
3.1.3.4). Hence a proof/disproof of (3.5) cannot be obtained in this
deductive system. The following theorem avoids this problem by showing
that if p is a sorted-predicate, then the proof of (3.5) is equivalent
to the proof of (π* and s < t) and σ* and minx (s,t) < maxx (s,t) ⊨ ω
when σ* ⊭ p, p being a sorted-predicate.

Theorem 4

    Let φ* be an enriched disjunct and array (s,t) be a segment
    in the partition defined by φ*. Further assume that φ* ⊭
    sorted (s,t). Then

        φ* and not sorted (s,t) ⊨ ψ#   iff   φ' ⊨ ψ#

    where ψ# is a partitioned wff in the context of φ*, and φ' =
    φ* and s < t and minx (s,t) < maxx (s,t).

Proof

(⇐)  If φ' ⊨ ψ# then φ* and not sorted (s,t) ⊨ ψ# is obvious.

(⇒)  Suppose φ* and not sorted (s,t) ⊨ ψ#. If φ* is unsatisfiable,
     or if array (s,t) is empty, the theorem is trivially true. So
     let φ* and not sorted (s,t) be satisfiable. Consider any model
     M for φ', ⊨_M φ'. If sorted (s,t) is false in this model, then
     ⊨_M ψ#. Given a model, any permutation of elements of array (s,t)
     conserves the minx (s,t) and maxx (s,t). Thus, if we permute
     the elements of array (s,t) in model M, all predicates of ψ#,
     with the possible exception of (array (s,t) R array (s,t))-type
     predicates, must still be true. Since ψ# is a wff (in our
     system) there are only three possibilities for (array (s,t) R
     array (s,t)):

         1. array (s,t) < array (s,t)
         2. array (s,t) ≤ array (s,t)
         3. array (s,t) sorted array (s,t)

     The first one is unsatisfiable in every interpretation. The
     second one will still be true after permutation. The third
     one was false in M, and if the permutation is an appropriate one
     it may be true in the resulting model M'. But, if ψ# had this
     sorted (s,t) predicate, φ* and not sorted (s,t), being a
     satisfiable disjunct, cannot imply ψ#. Thus, if ⊨_M φ* and not
     sorted (s,t) then ψ# will be true in a model M' also, where M' is
     the result of permuting elements of array (s,t). That is, ψ# will
     be true in any model for φ'. ∎

3.2.3 Basic Theorem Prover is Correct

The structure of the proof of Ω ⊨ ω, as constructed by this
theorem prover, is shown in Figure 3.9. Theorem 3 yields the proof at
the lowest level in Figure 3.9; the remaining proofs are proven by
appropriate recursive/iterative calls (indicated by dotted lines; see
Algorithms 1, 2, 3, and 4).

We omit further details of the correctness proof of the basic
theorem prover.
Let Ω = Ω1 or Ω2, where Ω1 is a disjunct, and Ω2 is a wff, possibly false

    Ω ⊨ ω  ⇐  Ω1 ⊨ ω  and  Ω2 ⊨ ω

    Ω1 ⊨ ω:   for each linear ordering of boundaries of Ω1 and ω
                  prove Ω1i ⊨ ω

    Ω1i ⊨ ω:  let ω = ω1 or ω2 (ω2 may be empty)
                  Ω1i ⊨ p of ω1, and Ω1i ⊨ ω with p deleted;
                  if false,
                      Ω1i and not p ⊨ ω2
                      and Ω1i and p ⊨ ω1 or ω2

Figure 3.9 Structure of the Proof of Ω ⊨ ω
3.3 Evaluation of Backward Functions

Recall that the conclusion ω of the lemmas Ω ⊨ ω to be proven
was an augmented wff, possibly involving the functions subst, exchb and
nsrtb because of backward substitution (see Chapter 2). Similarly,
Ω was an augmented wff possibly involving the functions subst and
unmodifiedpartsof. To be able to use the basic theorem prover presented
in Section 3.1, we transform these augmented wffs by evaluating the
functions to produce simple wffs not involving any of these functions.

Strictly speaking, the evaluation of functions like exchb,
nsrtb, etc., cannot be considered part of theorem proving. However, we
include it here because it plays an important role in our theorem proving,
and because this evaluation is done in the midst of the theorem-proving
effort.

Given the lemma Ω ⊨ ω to be proven, the subst functions, if any,
of Ω and ω are evaluated first. Let us call the resulting augmented
wffs Ω and ω. The premise Ω can be considered to be Ω1 or Ω2 where Ω1 is
a disjunct, and Ω2 is an augmented wff, possibly the wff false. We then
prove that Ω1 ⊨ ω, and that Ω2 ⊨ ω. In Section 3.3.2, we describe the
proof of Ω1 ⊨ ω. (This procedure is used repeatedly for each disjunct
of Ω.) The boundaries of Ω1 and ω are collected into a set B. Each
linear ordering of boundaries defines a partition of the array. The
boundaries collected are such that this partition has single-element
array segments array (s,s) and array (t,t) for each exchange x_s with x_t
statement (similarly for insert statements). It is then a simple matter
to evaluate the exchb and nsrtb functions. The resulting ω is a simple wff.

We now describe these two passes of evaluation in greater
detail.
3.3.1 First Pass of Evaluation

The evaluation of the subst functions is the simplest, and
constitutes the first pass of our evaluation. Clearly, before

    subst t for u in ψ

can be evaluated, all subst functions of the augmented wff ψ must be
evaluated. Assuming that ψ is free of subst functions, the ptr
expression t is substituted for every occurrence of the ptr variable u in
the augmented wff ψ, which may have only exchb and nsrtb functions.

Remark 7: Let S be a u ← t statement. Then, the entry assertion φ_E =
subst t for u in ψ_S generated for an exit assertion ψ_S is such that

    if ⊨_I φ_E then ⊨_I' ψ_S, and
    if ⊭_I φ_E then ⊭_I' ψ_S

where I' is the result of execution of S on an interpretation I.

The boundaries referred to in the conclusion ω and current
disjunct of the premise Ω1 are collected by Algorithm 5. As can be easily
seen, the boundaries included in B are such that the partition produced
is guaranteed to contain appropriate segments needed in the evaluation
of exchb, nsrtb and unmodifiedpartsof in the second pass.
B ← {-∞, +∞}
for each array segment array (s,t) referred to either in
         Ω1 or in ω do
    B ← B ∪ {s-1, s, t, t+1}
endfor
for each exchb x_s with x_t in ψ occurring in ω do
    B ← B ∪ {s-1, s, s+1, t-1, t, t+1}
endfor
for each nsrtb x_s below x_t in ψ occurring in ω do
    B ← B ∪ {s-1, s, s+1, t-2, t-1, t, t+1}
endfor
for each unmodifiedpartsof α wrt (s,t) occurring in Ω1 do
    B ← B ∪ {s-1, s, t, t+1}
endfor

Algorithm 5: Collecting Boundaries
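
As an illustration of Algorithm 5, the following Python sketch collects boundaries symbolically. It rests on simplifying assumptions that are not part of the thesis: a boundary is encoded as a pair (ptr name, integer offset), the two infinite boundaries are the pairs ("-oo", 0) and ("+oo", 0), ptr expressions are restricted to plain ptr names, and the four kinds of occurrences are passed in as already-extracted lists of (s, t) pairs. The names collect_boundaries and shifted are hypothetical.

```python
def shifted(ptr, offsets):
    """Boundaries ptr + d for each offset d, encoded as (ptr, d) pairs."""
    return {(ptr, d) for d in offsets}


def collect_boundaries(segments=(), exchbs=(), nsrtbs=(), unmodified=()):
    """Collect the boundary set B following the four loops of Algorithm 5.

    Each argument is an iterable of (s, t) pairs of ptr names taken
    from the premise disjunct and the conclusion.
    """
    B = {("-oo", 0), ("+oo", 0)}
    for s, t in segments:        # array (s,t)
        B |= shifted(s, (-1, 0)) | shifted(t, (0, 1))
    for s, t in exchbs:          # exchb x_s with x_t
        B |= shifted(s, (-1, 0, 1)) | shifted(t, (-1, 0, 1))
    for s, t in nsrtbs:          # nsrtb x_s below x_t
        B |= shifted(s, (-1, 0, 1)) | shifted(t, (-2, -1, 0, 1))
    for s, t in unmodified:      # unmodifiedpartsof ... wrt (s,t)
        B |= shifted(s, (-1, 0)) | shifted(t, (0, 1))
    return B


# Boundaries for a conclusion containing one exchb x_i with x_j.
B = collect_boundaries(exchbs=[("i", "j")])
```

For the single exchb in the example, B holds the two infinite boundaries together with i-1, i, i+1, j-1, j and j+1, which is what guarantees the single-element segments array (i,i) and array (j,j) in every partition.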

3.3.2  Second  Pass  of  the  Evaluation 


The  second  pass  is  made  for  each  linear  ordering  L  of  the 
boundaries  collected  as  above.  None  of  the  functions  exchb,  nsrtb  and 
unmodifiedpartsof  changes  the  ptr  expressions.  While  the  first  pass 
has  an  effect  only  on  ptr  expressions,  the  second  pass  has  its  effect 
only  on  the  array  segments,  which  depend  on  the  context  L.  Again,  the 
evaluation  of  exchb  and  nsrtb  is  from  inside  out. 
exchb

Assuming that ψ is a wff, that is, that ψ is free of exchb and nsrtb
functions,

    exchb x_s with x_t in ψ

is evaluated in the context of the partition defined by the current
linear ordering of boundaries. The wff ψ is expressed as ψ# using the
partitioned array segments. Note that the partition produced will have
single-element array segments A = array (s,s), and B = array (t,t) (see
the second for-loop of Algorithm 5). The exchb is evaluated by substituting
B for A, and vice versa, in every array predicate of ψ#.

nsrtb

Again assuming that ψ is free of exchb and nsrtb functions,

    nsrtb x_s below x_t in ψ

is evaluated in the context of the present partition. The wff ψ is
expressed as ψ# using the partitioned array segments. Note that since
s-1, s, s+1, t-2, t-1, t and t+1 are included in the set of boundaries,
the partition produced will have single-element segments array (s,s),
array (t-1,t-1), and array (t,t). The boundaries are already ordered,
and we consider the two cases s < t, or s > t.

Suppose s < t. Then the following transformations are made
on the array segments A = array (u,v) of the predicates of ψ#:

    1.  If A is a subsegment of array (s,t-2) then A is redefined
        as array (u+1,v+1).

    2.  If A is the same segment as array (t-1,t-1) then A has
        the new definition: array (s,s).

    3.  The definition of A is unchanged otherwise.

Now suppose s > t. Then the following transformations are
made on the array segment A:

    1.  If A is a subsegment of array (t+1,s) then A is redefined
        as array (u-1,v-1).

    2.  If A is the same segment as array (t,t) then A is
        redefined as array (s,s).

    3.  Otherwise, the definition of A is unchanged.
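
The segment transformations for nsrtb can be checked against a direct execution of the insert statement. The Python sketch below does this on a small list. It assumes (the thesis defines the statement elsewhere) that insert x_s below x_t removes the element at position s and places it immediately below the element x_t, shifting the elements in between by one position; positions are 1-indexed, and the helper names insert_below, nsrtb_segment and segment are hypothetical.

```python
def insert_below(xs, s, t):
    """Execute 'insert x_s below x_t' on a copy of xs (1-indexed positions)."""
    a = list(xs)
    x = a.pop(s - 1)
    if s < t:
        a.insert(t - 2, x)   # x lands at position t-1
    else:
        a.insert(t - 1, x)   # x lands at position t
    return a


def nsrtb_segment(u, v, s, t):
    """Map a segment (u, v) of the exit state to the entry-state segment."""
    if s < t:
        if s <= u and v <= t - 2:
            return (u + 1, v + 1)
        if (u, v) == (t - 1, t - 1):
            return (s, s)
    elif s > t:
        if t + 1 <= u and v <= s:
            return (u - 1, v - 1)
        if (u, v) == (t, t):
            return (s, s)
    return (u, v)


def segment(a, u, v):
    """Elements of array (u,v), 1-indexed and inclusive."""
    return a[u - 1:v]


pre = [10, 20, 30, 40, 50, 60]
post = insert_below(pre, 2, 5)   # case s < t
```

In the example, post is [10, 30, 40, 20, 50, 60]; the exit-state segment array (2,3) corresponds to the entry-state segment array (3,4), and array (4,4) corresponds to array (2,2), in agreement with rules 1 and 2 for s < t.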


Remark 8: Let S be either an exchange x_s with x_t or an insert x_s below
x_t statement, and let φ_E be the corresponding exchb x_s with x_t in ψ_S or
nsrtb x_s below x_t in ψ_S statement where ψ_S is the exit assertion of S.
Then ψ_S is true in M', which is the result of the execution of S on M,
iff φ_E is true in M, where M is a model for the context. More formally,
if L is the present linear ordering of boundaries and ⊨_M L then

    if ⊨_M φ_E then ⊨_M' ψ_S, and
    if ⊭_M φ_E then ⊭_M' ψ_S

Lemma 3 Let S be a straightline program segment, that is, S is a
sequence of ptr assignment, exchange or insert statements, with ψ_S as its
exit assertion, and let φ_E be the entry assertion of S obtained as above
in the context of L. Then

    if ⊨_M φ_E then ⊨_M' ψ_S, and
    if ⊭_M φ_E then ⊭_M' ψ_S

where M is a model for the linear ordering L of boundaries and M' is
the result of execution of S on M.

Proof by repeated applications of Remarks 7 and 8. ∎
unmodifiedpartsof

The description of the evaluation of

    unmodifiedpartsof α wrt (s,t)

is somewhat complicated because of the details needed. Intuitively,
since the procedure called can permute the elements of array (s,t), all
predicates of α# which depend on the strict subsegments of array (s,t)
are deleted from α#. In addition, the predicate sorted (s,t), if
present in α#, is deleted from α#. The complication arises from the
possibility that the current linear ordering of boundaries partitions the
segment array (s,t) into smaller segments. In such a case, it will be
necessary to temporarily "join together" contiguous segments to see if
the entire segment array (s,t) is related to other segments of
array (-∞,s-1) or array (t+1,+∞).

Let array (s,t) = S1; S2; . . .; Sp: that is, the S_i's
constitute the p subsegments of array (s,t) from the boundary s to t. Then
the evaluation is done as described in Algorithm 6.
φ ← the wff false
for each disjunct α1 of α do
    π1 ← ptr graph of α1
    for each linear ordering L induced by π1 do
        φ1 ← L
        let α1# be the disjunct α1 expressed using partitioned
            segments
        for each array predicate (A R B) of α1# do
            cases
                neither A nor B is an S_i:
                    φ1 ← φ1 and (A R B) provided
                    (A R B) is not sorted A
                A is an S_i, B is not:
                    φ1 ← φ1 and (array (s,t) R' B) provided
                    α1# has predicates S1 R1 B, S2 R2 B, . . .,
                    Sp Rp B such that for 1 ≤ j ≤ p, Sj Rj B ⊨
                    Sj R' B and if Sj Rj B ⊨ Sj R'' B then
                    Sj R' B ⊨ Sj R'' B
                B is an S_i, A is not:
                    as above with A, B exchanged
                A is an S_i, B is an S_j:
                    φ1 is unchanged
            endcases
        endfor
        φ ← φ1 or φ
    endfor
endfor

Algorithm 6: Obtaining φ = unmodifiedpartsof α wrt (s,t)
This completes the description of the evaluation of functions.
In the next section, we present the theorem prover with the extensions
required by the evaluation, and give arguments to establish the fact
that the extended theorem prover is a decision procedure.

3.4 Extended Theorem Prover is a Decision Procedure

Let us briefly review the backward substitution method (Section
2.2.2) of generating the verification conditions. To prove {Φ|P|Ψ}, the
program P is decomposed into straightline program segments S and we
then prove {φ|S|ψ}, where φ and ψ are generated from Φ, Ψ, and the loop
invariants given. Each {φ|S|ψ} is proven by proving the generated lemma
φ ⊨ φ_B where φ_B is the entry assertion for S generated from ψ by
backward substitution. We recall that the backward substitution of
[King 1969] is such that

    if ⊨_I φ_B then ⊨_I' ψ, and
    if ⊭_I φ_B then ⊭_I' ψ                                   (3.7)

where I is any interpretation, and I' is the result of executing the
program segment S on I. Note that {φ_B|S|ψ} is a milder statement than
(3.7). It follows immediately that, for any entry assertion φ,

    {φ | S | ψ} iff φ ⊨ φ_B                                  (3.8)

In general, φ_B has many disjuncts which are unsatisfiable in every model
of φ, making it unnecessary to consider these. The diligent reader may
have noticed that the contextual backward function evaluation of the
previous section may generate entry assertions φ_E which do not satisfy
the property (3.7). The wff φ_E is essentially φ_B from which some
disjuncts are deleted.

To illustrate this dramatically, consider the one-line program
segment S = exchange x_i with x_j, the exit assertion ψ_S being sorted (1,n).
The φ_B generated by true backward substitution is the equivalent of that
shown in Figure 3.10a. This φ_B does satisfy the property (3.7). (For
readability we have not written φ_B in disjunctive normal form.) However,
the backward function evaluation in the context of 1 ≤ i ≤ j ≤ n yields
φ_E shown in Figure 3.10b. As can be seen, φ_E is much simpler than φ_B,
whose generation does not depend on the context of the given entry
assertion.

Theorem 5 (Validity of contextual backward function evaluation)
For a straightline program segment S,

    {φ | S | ψ} iff φ* ⊨ φ_E

Proof Without loss of generality, we assume that the given entry
assertion is φ*, the enriched version of φ. Thus, we wish to prove

    {φ* | S | ψ} iff φ* ⊨ φ_E.                               (3.9)

We actually prove a slightly stronger version than (3.9), namely,

    {φ* | S | ψ} iff for every disjunct φ_i* of φ*, φ_i* ⊨ φ_Ei
S   = exchange x_i with x_j
ψ_S = sorted (1,n)

φ_B = φ_E with context 'true'

    =  1 ≤ i ≤ j ≤ n and sorted(1,i-1) ≤ x_j ≤ sorted(i+1,j-1) ≤ x_i ≤ sorted(j+1,n)
    or 1 ≤ j < i ≤ n and sorted(1,j-1) ≤ x_i ≤ sorted(j+1,i-1) ≤ x_j ≤ sorted(i+1,n)
    or 1 ≤ i ≤ n < j and sorted(1,i-1) ≤ x_j ≤ sorted(i+1,n)
    or j < 1 ≤ i ≤ n and sorted(1,i-1) ≤ x_j ≤ sorted(i+1,n)
    or 1 ≤ j ≤ n < i and sorted(1,j-1) ≤ x_i ≤ sorted(j+1,n)
    or i < 1 ≤ j ≤ n and sorted(1,j-1) ≤ x_i ≤ sorted(j+1,n)
    or sorted(1,n) and (i < 1 or i > n) and (j < 1 or j > n)

(a) φ_B: Context-Free Backward Substitution (Simplified)

φ_E in the context of 1 ≤ i ≤ j ≤ n

    =  sorted(1,i-1) ≤ x_j ≤ sorted(i+1,j-1) ≤ x_i ≤ sorted(j+1,n)

(b) φ_E: Contextual Backward Substitution (Simplified)

Figure 3.10 Contextual and Context-Free Backward Substitutions
where φ_Ei is the entry assertion of S obtained by evaluating the
backward functions in the context of L_i, the linear ordering of boundaries
defined by the (enriched) disjunct φ_i*. That φ_i* ⊨ φ_Ei, say for i = 1, is
shown by proving that

    φ_B and L1 ⊨ φ_E1 and L1, and φ_E1 and L1 ⊨ φ_B and L1.

Thus, φ_B is φ_E1 or φ_E2 or . . . .

φ_B and L1 ⊨ φ_E1 and L1:

    Suppose φ_B and L1 ⊭ φ_E1. Let M be a model for φ_B and L1, such
that ⊭_M φ_E1. Then φ_E1 and L1 is not true in M. By Lemma 3 of Section
3.3.2, if ⊭_M φ_E1 and L1 then ⊭_M' ψ, where M' is the result of the
execution of S on M. Since M is a model for φ_B, this contradicts
property (3.7).

φ_E1 and L1 ⊨ φ_B and L1:

    Suppose φ_E1 and L1 is true in M, and ⊭_M φ_B. By property (3.7)
⊭_M' ψ. But by Lemma 3, if ⊨_M φ_E1 and L1 then ⊨_M' ψ, a contradiction. ∎

The advantage in generating φ_E's rather than φ_B should be
obvious. The assertion φ_B will have as many disjuncts as there are linear
orderings of the boundaries collected from S and ψ. Several of these
linear orderings are of no concern to us, since we only need to prove
that execution of S on an input satisfying φ results in ψ. We do not
care what S does on a linear ordering of boundaries contradicting the
partial order specified by φ.
Since contextual backward function evaluation is valid, the
extended theorem prover is a decision procedure for Ω ⊨ ω, where ω and
Ω are augmented wffs.

3.5 Counterexample Generation

Whenever the theorem prover determines that Ω ⊭ ω it is
possible, in this system, to construct a model M for Ω such that ω is false
in M. However, it should be realized that M may not be a "counterexample
to the program." This is because even though {Φ|P|Ψ}, the loop invariants
given may not be strong enough to prove all the lemmas generated.
Counterexamples will, hopefully, provide clues for strengthening the loop
invariants.

Suppose Ω ⊭ ω. Then there must exist (see Algorithm 2:
provetheorem) a linear ordering L, ptr graph π and key graph σ of a disjunct
Ω1 of Ω such that

    π and L and σ ⊭ ω.

Let ω# be the partitioned version of ω in the context of L,

    ω and L ⊨ ω# and L, and ω# and L ⊨ ω and L

    ω# = ω1# or ω2# or . . . or ωc#

The last call (from either Algorithm 2 or 4) of Algorithm 3: provelemma
gives a satisfiable disjunct Ω11*, and ω# = ωc# such that

    Ω11* ⊭ ωc#

and for 1 ≤ i < c,

    Ω11* and ωi# is unsatisfiable.

Thus, a model M for Ω11* such that ⊭_M ωc# is a counterexample to (π and
L and σ ⊨ ω) and hence to Ω ⊨ ω. Since Ω11* ⊭ ωc#, there must be a
predicate p in ωc# such that Ω11* ⊭ p (see Algorithm 4). The disjunct Ω11*
and not p is satisfiable, and a model can be constructed as in Section 3.2.1
for the transitive closures of Ω11* and not p.
4. GENERALITY

In the last two chapters, we have seen the successful
application of inference rules about partitioning, closure and local
implication in the verification of programs written and asserted in our
languages. Though these vital inference rules are developed here as the
result of severe constraints imposed primarily by the assertion
language, they do apply to a wider class of programs manipulating data
structures. We now give several examples to support this contention.

4.1  Constraints  of  the  Present  Verification  System 
The verification system was designed with the specific goal
of being usable in SORTLAB to verify the correctness of student
programs for sorting an array. Severe constraints were imposed on the
programming and assertion languages both to limit the class of programs
to sorting-type problems and to obtain a system that is usable in a
practical situation. Not all these constraints are technically necessary
for making the theorem prover a decision procedure, though they have
value pedagogically.

For example, the verifier can be enhanced quite easily to
permit many arrays, temporary variables, ptr expressions like j + 8, and
predicates like array (s,t) - 3 ≤ array (u,v), which means

    array (s,t) - 3 ≤ array (u,v)  ≡  ∀i ∀j (s ≤ i ≤ t and u ≤ j ≤ v → x_i - 3 ≤ x_j)

However, if arbitrary assignments to array elements are allowed, it is
not clear how the verifier can be extended to prove the key-preserving
property of sorting algorithms.

It  is  not  possible  to  characterize  the  class  of  programs 
provable  in  this  system  except  as  those  programs  that  can  be  written  in 
our  programming  language  and  for  which  sufficiently  strong  assertions 
can  be  made  in  our  assertion  language.     Theoretically  speaking,  all 
computable  functions  are  programmable  in  the  programming  language.     How- 
ever, for  most  computable  functions  strong  enough  assertions  do  not 
exist  in  our  assertion  language  that  permit  a  proof  that  the  correspond- 
ing program  computes  the  function.     Thus,  e.g.,  heap  sort  and  several 
merging  programs  can  be  written  in  the  programming  language,  but  strong 
enough  assertions  to  prove  that  these  programs  also  sort  do  not  exist 
in  our  assertion  language. 

4.2     Partitioning 


Several   properties  on  a  data  structure  can  be  expressed  as 
properties  on  its  substructures,  and  by  interrelationships  among  these 
components.     For  example, 

    sorted (s,t)  iff  s ≥ t or (for all u, s ≤ u < t:
                                 sorted (s,u) ≤ sorted (u+1,t))

    avl-tree (r)  iff  empty-tree (r) or
                       avl-tree (left (r)) and avl-tree (right (r))
                       and -1 ≤ height (left (r)) - height (right (r)) ≤ 1

A typical verification condition Ω |= ω of a program aiming to produce such
a property on a data structure is of the following kind: the conclusion
ω refers to larger parts of a data object having the property, while the
premise Ω refers to smaller parts of the data object which have the same
(or similar) property and contains certain interrelationships between
these parts. Proving Ω |= ω becomes much simpler in such cases if both
Ω and ω are expressed in terms of a set of common parts of the data object.
Partitioning is a technique which decomposes the data object into small
enough components so that every segment of the data structure referred to in
Ω or ω is a union of some of these components.
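
The idea can be sketched for array segments with concrete integer
bounds. This is only an illustration under that assumption, not the
verifier's own procedure (which works on symbolic ptr expressions); the
function names `partition` and `cover` are hypothetical.

```python
def partition(segments):
    """Split the index range into elementary components.

    `segments` is a list of inclusive (s, t) index pairs with s <= t.
    Returns a sorted list of disjoint inclusive components such that
    every input segment is exactly a union of some of them.
    """
    # Every segment start, and every position just past a segment end,
    # is a cut point between two components.
    cuts = sorted({s for s, _ in segments} | {t + 1 for _, t in segments})
    return [(a, b - 1) for a, b in zip(cuts, cuts[1:])]


def cover(segment, components):
    """Return the components whose union is `segment`."""
    s, t = segment
    return [(a, b) for a, b in components if s <= a and b <= t]


# Segments mentioned in some premise and conclusion (made-up example).
segments = [(1, 10), (1, 4), (5, 10), (3, 7)]
components = partition(segments)
print(components)
print(cover((3, 7), components))
```

Once premise and conclusion are both rewritten over `components`, a
predicate on a large segment can be compared directly with predicates on
the smaller segments that make it up.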

4.3  Closure  and  Local  Implication 

Much  of  the  inefficiency  in  general  theorem  provers  can  be 
traced  to  their  inability  to  choose  appropriately  those  predicates 
of  the  premise  which  would  imply  a  certain  conclusion.  The  rule  of 
local  implication  completely  avoids  this  problem  by  specifying  the  pre- 
dicate of  the  premise  that  determines  if  a  given  predicate  of  the  con- 
clusion follows  from  the  premise.  It  should  be  noted  that  the  rule 
of  local  implication  is  valid  only  when  the  ptr  and  key  graphs  are 
transitively  closed. 

A  rule  of  local  implication  can  trivially  be  formulated  in  any 
deductive  system  if  all  possible  inferences  from  the  given  premises 
are  collected  as  the  closure  of  the  premise.  However,  this  may  not  be 
practical  either  because  it  takes  a  long  time  or  because  the  closure 
is  not  finite.  We  therefore  seek  inference  rules  yielding  only  finitely 
many  inferences  from  given  premises  and  obtain  the  closure  of  such 
rules. In the context of proving lemmas about partitionable properties
on data structures, it is generally possible to obtain this closure
rapidly, and to invent appropriate rules of local implication.
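
A minimal sketch of closure followed by local implication, restricted to
order predicates between ptr variables. The representation (triples
`(a, b, strict)`) and the names `closure` and `follows` are assumptions
made for this illustration; unsatisfiable premises are not detected.

```python
from itertools import product


def closure(variables, premises):
    """Transitively close a set of order predicates.

    `premises` is a list of (a, b, strict) triples meaning a < b when
    `strict` is True and a <= b otherwise. The result maps (a, b) to
    the strongest derivable relation: True for <, False for <=.
    """
    rel = {}
    for a, b, strict in premises:
        rel[(a, b)] = rel.get((a, b), False) or strict
    # Warshall's algorithm (the intermediate variable k is the outermost
    # loop); a chain of relations is strict if any link is strict.
    for k, a, b in product(variables, repeat=3):
        if (a, k) in rel and (k, b) in rel:
            strict = rel[(a, k)] or rel[(k, b)]
            rel[(a, b)] = rel.get((a, b), False) or strict
    return rel


def follows(rel, a, b, strict):
    """Local implication: a single lookup in the closed premise."""
    if a == b:
        return not strict
    return (a, b) in rel and (rel[(a, b)] or not strict)


# Premise: i <= k and k < j and j <= n.
rel = closure(["i", "k", "j", "n"],
              [("i", "k", False), ("k", "j", True), ("j", "n", False)])
print(follows(rel, "i", "n", True))
print(follows(rel, "i", "k", True))
```

After the closure is computed, each predicate of the conclusion is
decided by exactly one entry of `rel`, so no search over combinations of
premise predicates is needed.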

4.4  Examples 

Several  examples  from  the  literature  are  used  in  this  section 
to  support  our  contention  that  the  techniques  developed  for  SORTLAB  are 
in  fact  applicable  to  a  wider  class  of  programs.  The  treatment  of  these 
examples  is  necessarily  brief;  we  only  indicate  how  a  relevant  partition 
may  be  constructed.  We  also  assume,  without  further  ado,  that  ap- 
propriate extensions  are  made  to  the  programming  and  assertion  languages 
where  necessary. 

4.4.1  A  Geometric  Example 

Consider finite plane maps which can be described using rectangles
with one side parallel to the x-axis, and the operations union (+),
intersection (·) and negation (¬). Thus A + B is the map covered by
the rectangle A or B, A·B represents the map common to both A and B, and
¬A represents the map not covered by A. The shaded map shown below can
be described by several expressions.
For example,

    (1-2-3-4)·¬(5-6-7-8)·¬(9-10-11-12)

    (1-14-15-4)·¬(5-6-7-8) + (13-2-3-16)·¬(9-10-11-12)

    (1-2-3-4)·¬(5-10-11-8) + (6-9-12-7)

The problem we wish to consider is: given two expressions E1
and E2, decide if E1 and E2 are describing the same map. If the
coordinates of all points referred to in E1 and E2 are constants, the problem
is trivial. But, if the points are arithmetic expressions (with plus,
minus only) of free variables and constants, the problem can be answered
by decomposing the maps described by E1 and E2 as follows.

Let the rectangle A contain a corner p of another rectangle B.
Then, p splits A into four smaller rectangles A1, A2, A3 and A4 as shown
below. Repeat this process until none of the partitioned rectangles
contain corners of other rectangles.

[Figure: a corner p of rectangle B splits rectangle A into A1, A2, A3
and A4.]

Clearly, each original rectangle
is a union of some of these partitioned rectangles. If we now impose
a linear ordering on these partitioned rectangles (e.g., A precedes
B if the coordinates of the left-top corner of A are (x1,y1) and
those of B are (x2,y2) such that either x1 < x2, or x1 = x2 and y1 < y2),
the original expressions E1 and E2 can be rewritten in a canonical form,
and E1 will be equivalent to E2 if their partitioned expressions are
identical.
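
The canonical form can be sketched for the easy case in which all
coordinates are constants (the symbolic case discussed above is not
attempted here). The tuple encoding of expressions and the names
`corners`, `covers`, `canonical` and `same_map` are assumptions of this
sketch; cells are cut at every rectangle coordinate, so each cell lies
wholly inside or wholly outside every rectangle.

```python
import math


def corners(expr):
    """Collect the x and y coordinates of every rectangle in `expr`."""
    if expr[0] == "rect":
        _, x1, y1, x2, y2 = expr
        return {x1, x2}, {y1, y2}
    xs, ys = set(), set()
    for sub in expr[1:]:
        sx, sy = corners(sub)
        xs |= sx
        ys |= sy
    return xs, ys


def covers(expr, cell):
    """Decide whether the map described by `expr` contains `cell`."""
    op = expr[0]
    if op == "rect":
        _, x1, y1, x2, y2 = expr
        xa, xb, ya, yb = cell
        return x1 <= xa and xb <= x2 and y1 <= ya and yb <= y2
    if op == "+":
        return covers(expr[1], cell) or covers(expr[2], cell)
    if op == ".":
        return covers(expr[1], cell) and covers(expr[2], cell)
    if op == "not":
        return not covers(expr[1], cell)
    raise ValueError(op)


def canonical(expr, xs, ys):
    """List, in a fixed order, the grid cells covered by `expr`."""
    # Infinite sentinels let negation describe the unbounded outside.
    xs = [-math.inf] + sorted(xs) + [math.inf]
    ys = [-math.inf] + sorted(ys) + [math.inf]
    cells = [(xa, xb, ya, yb)
             for xa, xb in zip(xs, xs[1:])
             for ya, yb in zip(ys, ys[1:])]
    return [c for c in cells if covers(expr, c)]


def same_map(e1, e2):
    """Two expressions agree iff their canonical forms are identical."""
    x1, y1 = corners(e1)
    x2, y2 = corners(e2)
    xs, ys = x1 | x2, y1 | y2
    return canonical(e1, xs, ys) == canonical(e2, xs, ys)


a = ("rect", 0, 0, 4, 4)
hole = ("rect", 1, 1, 2, 2)
e1 = (".", a, ("not", hole))
e2 = ("+",
      ("+", ("rect", 0, 0, 1, 4), ("rect", 2, 0, 4, 4)),
      ("+", ("rect", 1, 0, 2, 1), ("rect", 1, 2, 2, 4)))
print(same_map(e1, e2))
```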

4.4.2 Simple Array Examples

All  the  verification  conditions  of  the  two  examples  given  in 
this  section  can  be  proven  by  partitioning  the  array  as  described  in 
Section  3.1 .3.4. 

4.4.2.1  Binary  Search 

The example given in Algorithm 7 is a classical binary
search algorithm. The proof that the algorithm searches correctly a
sorted array x(m...n) for an element z does not depend on the index k
being equal to (i+j) div 2; this particular choice of k only makes
the algorithm more efficient (O(log2(n-m))). For the algorithm to
search properly it is sufficient that the function f be such that
whenever i < j, i ≤ f(i,j) < j. The verification condition for the loop is

    sorted (m,n) and i ≤ k < j and
    (z isin-array (i,j) or z notin-array (m,n))
    |=
    sorted (m,n) and i ≤ k and x_k ≥ z and
    (z isin-array (i,k) or z notin-array (m,n))
    or
    sorted (m,n) and k + 1 ≤ j and x_k < z and
    (z isin-array (k + 1,j) or z notin-array (m,n))

The predicate notin-array is the negation of isin-array, where

    z isin-array (s,t) =
        s = t and x_s = z or
        (for some u such that s ≤ u < t
         z isin-array (s,u) or z isin-array (u+1,t))



    * sorted (m,n)
    i ← m; j ← n
    while i < j do
        k ← f(i,j)
        if x_k < z
            then i ← k + 1
            else j ← k
        endif
        * i ≤ j and (z isin-array (i,j) or z notin-array (m,n))
          and sorted (m,n)
    endwhile
    found ← (x_i = z)
    * (found ↔ z isin-array (m,n))


Algorithm 7. Classical Binary Search
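
Algorithm 7 can be transcribed into Python as a sketch, with the loop
invariant checked at run time. The 0-based list indexing and the
default choice of `f` are assumptions of this transcription; any `f`
with i ≤ f(i,j) < j is accepted, as the text above requires.

```python
def binary_search(x, m, n, z, f=lambda i, j: (i + j) // 2):
    """Search the sorted, non-empty segment x[m..n] (inclusive) for z."""
    i, j = m, n
    while i < j:
        k = f(i, j)
        assert i <= k < j, "f must satisfy i <= f(i, j) < j"
        if x[k] < z:
            i = k + 1
        else:
            j = k
        # Loop invariant: z is in x[i..j], or z is not in x[m..n] at all.
        assert i <= j and (z in x[i:j + 1] or z not in x[m:n + 1])
    return x[i] == z


x = [1, 3, 5, 7, 9, 11]
print(binary_search(x, 0, 5, 7))
# A deliberately poor f still searches correctly, only more slowly.
print(binary_search(x, 0, 5, 9, f=lambda i, j: i))
```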




4.4.2.2     Dutch  National    Flag  Problem 

The problem is to rearrange the elements of an array x which
are three-valued, viz., either red, white or blue, into contiguous red-,
white- and blue-colored segments from the low end to the high end
respectively [Dijkstra 1976]. A solution to the problem is given here as
Algorithm 8. The predicates red, white and blue on array segments are
defined as follows:

    c(s,t) = (s < t and for all u such that s ≤ u < t
                  c(s,u) and c(u+1,t)
              or s = t and color (s) = c
              or s > t)

where c is to be substituted by red, white, or blue. The backward
function evaluation and partitioning techniques of Chapter 3 are adequate
to prove the partial correctness of this algorithm.
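
A Python sketch of Algorithm 8 follows, with the loop invariant asserted
after every iteration. The helper `seg` plays the role of the predicate
c(s,t); the choice of which elements are exchanged in the red and blue
cases (x_r with x_w, and x_w with x_b) is the standard one and is an
assumption here, since it is what the invariant requires.

```python
def seg(a, c, s, t):
    """c(s,t): every element of a[s..t] has color c (true when s > t)."""
    return all(a[u] == c for u in range(s, t + 1))


def dutch_flag(x):
    """Rearrange x in place into red, white and blue segments."""
    n = len(x)
    a = [None] + x            # 1-based working copy, a[1..n]
    r, w, b = 1, 1, n
    while w <= b:
        if a[w] == "white":
            w += 1
        elif a[w] == "red":
            a[r], a[w] = a[w], a[r]
            r += 1
            w += 1
        else:                 # blue
            a[w], a[b] = a[b], a[w]
            b -= 1
        # Loop invariant of Algorithm 8.
        assert seg(a, "red", 1, r - 1) and seg(a, "white", r, w - 1)
        assert seg(a, "blue", b + 1, n) and 1 <= r <= w <= b + 1 <= n + 1
    x[:] = a[1:]
    return r, w


flag = ["blue", "white", "red", "red", "blue", "white", "red"]
dutch_flag(flag)
print(flag)
```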

4.4.3     Heap  Sort 


Algorithm 9 [Floyd 1964] imposes the structure of a binary
tree on the array to sort its elements. We formulate the sift-up
algorithm recursively; an iterative version of this algorithm is not
provable using our partitioning technique (see Section 4.5). The predicates
ordt and x_u ≥ tree (·,·) are defined below:




    r ← 1; w ← 1; b ← n
    while w ≤ b do
        cases color (w) of
            white: w ← w + 1
            red:   (exchange x_r with x_w;
                    r ← r + 1; w ← w + 1)
            blue:  (exchange x_w with x_b;
                    b ← b - 1)
        endcases
        * red (1,r-1) and white (r,w-1) and blue (b+1,n) and
          1 ≤ r ≤ w ≤ b+1 ≤ n+1
    endwhile
    * red (1,r-1) and white (r,w-1) and blue (w,n)
      and 1 ≤ r ≤ w = b+1 ≤ n+1

Algorithm 8. Dutch National Flag
    x_u ≥ tree (s,t)  =  (s ≤ t and x_u ≥ x_s and x_u ≥ tree (2s,t)
                              and x_u ≥ tree (2s+1,t)
                          or s > t)

    ordt (s,t)        =  (s ≤ t and x_s ≥ ordt (2s,t)
                              and x_s ≥ ordt (2s+1,t)
                          or s > t)

    x_u ≥ ordt (s,t)  =  (x_u ≥ tree (s,t) and ordt (s,t))

    heap (s,t)        =  (s < t and heap (s+1,t) and ordt (s,t)
                          or s ≥ t)



Since our interest here is to demonstrate the applicability of
the principle of partitioning, we shall take the liberty of simplifying
the verification conditions. A crucial verification condition of the
siftup procedure is:

    {j = 2i < n and x_j < x_i and x_{j+1} ≤ x_i and
     ordt (2j,n) and ordt (2j+1,n) and
     x_i ≥ tree (j,n) and x_i ≥ ordt (j+1,n)
     [call siftup (j,n)]
     ordt (i,n)},                                            (4.1)
    procedure siftup (i,n)
    * ordt (2i,n) and ordt (2i + 1,n)
    j ← 2 * i
    if j ≤ n then
        if j < n then
            if x_j < x_{j+1} then j ← j + 1 endif
        endif
        if x_i < x_j then
            exchange x_i with x_j;
            * ordt (2j,n) and ordt (2j + 1,n) and x_i ≥ tree (j,n)
              and x_i ≥ ordt (j + 1,n)
            call siftup (j,n)
        endif
    endif
    * ordt (i,n)
    endproc

Algorithm 9(a). Recursive Siftup Algorithm
    procedure heapsort (n)
    for i ← n div 2 downto 2 do
        call siftup (i,n)
        * heap (i,n) and 2 ≤ i ≤ n div 2
    endfor
    for i ← n downto 2 do
        call siftup (1,i);
        exchange x_1 with x_i
        * heap (2,i-1) and array (1,i-1) ≤ sorted (i,n) and 2 ≤ i ≤ n
    endfor
    * sorted (1,n)
    endproc

Algorithm 9(b). Heap Sort




assuming that siftup does not change the order of elements in any tree
(s,n) unless the tree is a subtree of tree (i,n). The Lemma (4.1),
therefore, reduces to:

    j = 2i < n and
    ordt (2j,n) and ordt (2j+1,n) and
    x_i ≥ tree (j,n) and x_i ≥ ordt (j+1,n) and
    ordt (j,n)
    |=
    ordt (i,n)                                               (4.2)


where call siftup (j,n) has added ordt (j,n). The relevant partition
of the "array" does not decompose it into contiguous array segments;
instead, it decomposes the tree (i,n) into its two subtrees tree (2i,n)
and tree (2i+1,n) and the root x_i. The proof of (4.2) requires
consideration of three cases: 2j > n, 2j = n, and 2j < n. To demonstrate
the use of a partition of the above type, consider the most interesting
case, 2j < n. We can rewrite (4.2) as:

    j = 2i < n and 2j < n and
    ordt (2j,n) and ordt (2j+1,n) and
    x_i ≥ tree (2j,n) and x_i ≥ tree (2j+1,n) and x_i ≥ x_j and
    x_i ≥ ordt (j+1,n) and
    x_j ≥ ordt (2j,n) and x_j ≥ ordt (2j+1,n)
    |=
    x_i ≥ x_2i and x_i ≥ ordt (4i,n) and x_i ≥ ordt (4i+1,n) and
    ordt (4i,n) and
    x_2i ≥ ordt (4i,n) and x_2i ≥ ordt (4i+1,n) and
    x_i ≥ ordt (2i+1,n)                                      (4.3)

As can be seen, the conclusion follows from the premise if 2i is
substituted for j.

The verification conditions for the two for-loops of heapsort
(Algorithm 9(b)) require even more complex partitioning: a
decomposition into subtrees as well as into array segments of one element.
However, an iterative version of the siftup algorithm does not yield to
such a decomposition of the heap, and hence is not provable by our
techniques.
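
Algorithm 9 can be sketched in Python as follows, with the exit
assertion ordt (i,n) of siftup checked at run time. The function
`ordt` below is an assumed executable reading of the predicate: it
compares each node only with its children, which is equivalent to the
definition above by transitivity. The array is 1-based (`x[0]` unused).

```python
def ordt(x, s, t):
    """The tree rooted at index s within x[1..t] is ordered."""
    if s > t:
        return True
    return all(c > t or (x[s] >= x[c] and ordt(x, c, t))
               for c in (2 * s, 2 * s + 1))


def siftup(x, i, n):
    """Recursive siftup; assumes ordt(2i, n) and ordt(2i + 1, n)."""
    j = 2 * i
    if j <= n:
        if j < n and x[j] < x[j + 1]:
            j += 1                      # j is the larger child
        if x[i] < x[j]:
            x[i], x[j] = x[j], x[i]
            siftup(x, j, n)
    assert ordt(x, i, n)                # exit assertion of Algorithm 9(a)


def heapsort(keys):
    """Return the keys in ascending order, following Algorithm 9(b)."""
    n = len(keys)
    x = [None] + list(keys)
    for i in range(n // 2, 1, -1):      # i = n div 2 downto 2
        siftup(x, i, n)
    for i in range(n, 1, -1):           # i = n downto 2
        siftup(x, 1, i)
        x[1], x[i] = x[i], x[1]
    return x[1:]


print(heapsort([5, 2, 9, 1, 5, 6]))
```

The run-time check costs time proportional to the subtree size and is
there only to make the asserted property visible; it is not part of the
algorithm.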

4.4.4  A  List  Moving  Algorithm 

Algorithm  10  [Reingold  1973,  Wagner  1974]  moves  all  nodes 
of  a  list  structure  accessible  from  a  root  to  a  new  contiguous  set 
of  nodes.  We  outline  a  proof  of  the  fact  that  what  is  copied  by  the 
algorithm  is  isomorphic  to  the  original  list  structure  composed  of  all, 
and  only,  those  nodes  accessible  from  the  root.  For  convenience  in  this 
proof,  we  have  introduced  the  tables  copyof  [•]  and  origof  [•],  and 
boolean  flags  copied  [•].  The  original  node,  origof  [q],  of  the  newly 
copied  node  q  is  not  required  by  the  algorithm  itself;  the  tables 
copied  [•]  and  copyof  [•]  may  be  overlapped  with  the  left  [•]  fields  of 
the  original  nodes  (see  Wagner  1974).  The  predicates  in  the  loop 
invariant  are  defined  below: 


    isocopy (q) = (q = 0 or
        isocopy (q-1) and data [q] = data [q0] and
        right [q] = copyof [right [q0]] and
        left [q] = copyof [left [q0]])
    procedure movelist (root)
    p ← 0; q ← 0; ℓ ← root
    call copy (ℓ)
    while q ≠ p do
        q ← q + 1
        call copy (left [q])
        call copy (right [q])
        * isocopy (q) and q to p and p from q and dupe (q,p)
    endwhile
    * isocopy (p) and p to p and p from p
    endproc


    procedure copy (var x)
    if x ≠ nil then
        if not copied [x]
            then p ← p + 1;
                 node [p] ← node [x];
                 copied [x] ← true;
                 copyof [x] ← p;
                 origof [p] ← x
        endif
        x ← copyof [x]
    endif
    endproc


Algorithm 10. A List Moving Algorithm
where q0 = origof [q]. (The nodes 1 through q constitute an isomorphic
copy of a substructure of the original list.)

    q to p
    = (q = 0 or
       q-1 to p and left [q] ≤ p and right [q] ≤ p or
       q-1 to p-1 and (right [q] ≤ p-1 and left [q] = p or
                       right [q] = p and left [q] ≤ p-1) or
       q-1 to p-2 and left [q] = p-1 and right [q] = p)

    p from q
    = (q = 0 or p from q-1 or
       p-1 from q-1 and (p = left [q] or p = right [q]) or
       p-2 from q-1 and p-1 = left [q] and p = right [q])

(q to p means that all nodes reachable from q using right-left links
are included in 1 ... p. Similarly, p from q denotes the converse,
i.e., all nodes included in 1 ... p are reachable from nodes in
1 ... q via the right-left links.)

    dupe (s,t) = (s > t or s = t and node [s] = node [origof [s]] or
                  for some u, s ≤ u < t and dupe (s,u) and
                  dupe (u+1,t))

(Nodes from s to t are exact copies of their original nodes.)


The partition of the copied list structure as indicated by
the above definitions of the predicates readily gives a proof of various
verification conditions of the list moving algorithm.
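
The list moving algorithm can be sketched in Python as below. Several
details are assumptions of this sketch rather than part of Algorithm 10:
the original nodes are held in a dictionary of (data, left, right)
triples, index 0 stands for nil, the copies go into a separate 1-based
table `new`, membership in `copyof` replaces the flags copied [·], and
the `var` parameter of copy is modelled by a return value.

```python
def movelist(nodes, root):
    """Copy every node reachable from `root` into a contiguous table.

    Returns (new, copyof, origof); new[1..p] are the copied nodes as
    [data, left, right] lists whose links refer to positions in `new`.
    """
    new = [None]                        # new[0] is unused
    copyof, origof = {}, {}

    def copy(x):
        # Return the address of the copy of x, copying on first visit.
        if x == 0:
            return 0
        if x not in copyof:
            new.append(list(nodes[x]))  # links still point at originals
            copyof[x] = len(new) - 1
            origof[len(new) - 1] = x
        return copyof[x]

    copy(root)
    q = 0
    while q != len(new) - 1:            # p is len(new) - 1
        q += 1
        new[q][1] = copy(new[q][1])     # left link now refers to a copy
        new[q][2] = copy(new[q][2])     # right link now refers to a copy
    return new, copyof, origof


# A small cyclic structure; node 40 is not reachable from the root 10.
nodes = {10: ("a", 30, 20), 20: ("b", 0, 10), 30: ("c", 20, 0),
         40: ("d", 0, 0)}
new, copyof, origof = movelist(nodes, 10)
print(new[1:])
```

On exit every position q in 1 ... p satisfies the three conjuncts of
isocopy with q0 = origof [q], which the test below checks directly.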
4.5 On the Applicability of Partitioning


As we have seen in the examples of the preceding section, a
class of programs that typically have loops (recursive calls) operate
on their data objects building up the desired property iteratively
(recursively). Two general approaches are discernible in the iterative
build-up of properties:

A1. The data structure having a desired property P is
    gradually built up. If D is a segment of the data object
    having property P, we find δD, an incremental part from
    the remaining part of the data object. The composite
    segment D + δD is manipulated so that D + δD has the
    property P. Repeat the process until all of the data
    object has the property P [Misra 1976].
A2. The desired property P on a data object is gradually built
    up. If D has a property Q, we manipulate D so that it
    now has property Q' which is "closer" to P than Q was.

The examples of Sections 4.4.2 and 4.4.4 belong to class A1.
Partitioning seems applicable to all such programs. It is, of course,
possible to describe an algorithm belonging to class A1 in terms of A2.
A bubble sorting algorithm can be thought of as converting an array that
is less-sorted to an increasingly-sorted array; however, the algorithm
is best put in class A1. On the other hand, there are algorithms
belonging to class A2 which it will be very difficult to describe in terms
of A1. A nonrecursive sift-up algorithm of heap sort (see Floyd 1964
and Section 4.4.3) descends the tree confining the undesirable property
that some tree is not ordered (ordt) to smaller and smaller trees.
This algorithm clearly belongs to class A2.

Thus, for partitioning to be applicable, it seems necessary
that the following requirements be satisfied:

R1. The data structures used must have disjoint components.
(Thus circular lists, "trees" with shared structures do
not satisfy this requirement, while stacks, queues,
linear lists, trees, tables do.)
R2.  It  should  be  possible  to  describe  the  property  P  on  data 
object  D  equivalently  in  terms  of  the  same  property  P  on 
components  of  D  obtained  by  a  finite  decomposition,  and 
possibly  some  interrelationships  among  the  components. 
(Properties  like  A  is  a  permutation  of  A  ,  are  not  thus 
partitionable,  while  those  like  T  is  an  AVL-tree,  array 
A  is  sorted,  or  array  A  is  a  heap  are.) 
R3.  The  property  P  being  sought  should  be  built-up  by  the 
algorithm  using  the  approach  Al . 

When the desired property P and data object D satisfy requirements
R1 and R2, it is generally possible to write programs that satisfy R3.
Thus, the applicability of partitioning depends not only on the intrinsic
properties of the data structure and the property P, but also on how
P is built up.


5. SORTLAB

The verification system described in Chapters 2 and 3 is at
the heart of a programming laboratory, called SORTLAB, which assists
the student-programmer in producing correct sorting algorithms from
basic ideas of these algorithms. SORTLAB consists of a program editor,
an interpreter, the program verifier described earlier and a
counterexample generator. These are implemented on the PLATO interactive system
as a "lesson." This lesson is a part of the Automated Computer Science
Education System (ACSES) developed by the Department of Computer Science
of the University of Illinois.

This chapter describes SORTLAB, its use and its implementation.
Sections 5.1 and 5.2 provide a context in which the performance of SORTLAB
should be evaluated.

5.1  PLATO 

The  PLATO  IV  interactive  system  [Alpert  and  Bitzer  1970]  is 
designed  to  support  more  than  500  users  logged-in  on  the  plasma-panel 
graphic  terminals.     The  users  can  be  divided  into   "authors"  who  write 
teaching-programs   ("lessons"),  and  "students"  who  execute  these  lessons 
at  their  own  pace.      It  is  expected  that  a  user  limit  CPU  usage  to  2 
milliseconds/clock-second;  any  attempted  over-use  will   be  reduced  to 
this  level   by  offering  fewer  time-slices. 

Each student-user has a data segment of 1,650 60-bit words. A
lesson is assigned a data space of 1,500 words in the central memory,
and it can access these 1,500 words and the first 150 words of the student
data segment. The 1,500-word space must be loaded (and unloaded) with
the contents of the remaining 1,500 words of the student data segment or of
a segment of extended core storage containing information that is common
to all users executing the lesson. Thus any lesson using more than 150
words of data must explicitly control this "paging."

The  single  most  annoying  factor  in  the  use  of  the  PLATO  system 
for  program  development  is  TUTOR,  the  only  programming  language  available 
to  authors,  in  which  the  lessons  are  to  be  written.  (For  a  short  intro- 
duction, see  Popular  Computing  1975;  a  detailed,  and  a  slightly  outdated 
description  may  be  found  in  [Sherwood  1975].)  TUTOR  is  a  high-level 
language  with  an  assembly-language-like  format.  It  contains  several 
machine-dependent  data  manipulative  statements  with  such  niceties  as 
nested  assignment  statements  and  generalized  versions  of  the  computed- 
goto  and  do-loop  statements  of  FORTRAN.  Procedure  blocks  may  be 
defined,  but  there  are  no  local  variables.  Each  variable  name  must  be 
assigned  an  address  by  the  programmer.  Several  variables  with  small 
values may be assigned to different segments of the same 60-bit word. In
addition to these features, there are several statements that are useful
in judging the student's response. The run-time system of TUTOR permits
nested procedure calls (recursive or not) at most 10 levels deep. Most
lessons written for PLATO have a simple structure; for these programs,
the lack of control structures, local variables, etc. is not a serious
impediment. Typically, such lessons also use little CPU-time. Most
students find it pleasant to "read" such lessons because of the
near-instantaneous response and excellent graphics. Any unpleasantness is
usually attributable to the author's style of writing his lesson.
5.2 ACSES

The Department of Computer Science of the University of Illinois
has  developed  on  PLATO  an  Automated  Computer  Science  Education  System 
[Nievergelt  1975]  for  beginning  students  in  computer  science.     It  con- 
sists of  a  large  body  of  lessons,  a  GUIDE  information  retrieval  and 
management  system  [Eland  1975]  and  an  interactive  programming  system 
[Wilcox  1973].     The  GUIDE  may  be  used  by  a  student  to  find  out  about 
his  records  or  to  choose  a  lesson  of  interest.     The  programming  system 
supports  several   languages  with  excellent  error  diagnostics.     The  body 
of  lessons  largely  consists  of  conventional   Computer  Assisted  Instruction 
lessons  about  various  aspects  of  computer  science.     Among  this  collection 
are  two  lessons  which  incorporate  novel   concepts  of  artificial   intelligence 
and  program  proving  adapted  to  run  on  limited  computer  resources: 
PATTIE  [Danielson  1975],  to  tutor  students  in  top-down  program  design; 
and  SORTLAB,   to  be  presented  in  the  next  section. 

5.3 SORTLAB--A Programming Laboratory

SORTLAB concerns itself with the implementation of certain
sorting algorithms. It provides a "laboratory" wherein a student can
perform programming "experiments" using the various equipment provided.
It does not actively suggest how an algorithm should be implemented,
but focuses the student's attention on the correctness of
his program by providing such tools as a specially designed, easy-to-learn
mini programming language, an excellent program editor, a program
verifier, a counterexample generator, and an interpreter for his programs.
Figure 5.1. Components of SORTLAB


5.3.1 Programming and Assertion Languages



The  languages  are  so  chosen  that  while  it  is  convenient  and 
natural  to  express  several  sorting  algorithms,  writing  other  programs 
is  not  easy.  The  particular  choice  of  basic  operations  in  the  program- 
ming language,  and  predicates  in  the  assertion  language  is  strongly 
influenced  by  decidability  considerations  (see  Section  2.2). 

A  program  example  is  given  in  Figure  5.2.  The  syntax  of  the 
languages  is  specified  in  Figures  5.3  and  5.4.  The  assertion  language 
semantics  is  specified  in  Section  3.1.1.  The  ptr  assignment,  while, 
if  and  call  statements  have  the  conventional  meaning.  The  semantics 
of  other  statements  of  the  programming  language  is  explained  in  the 
examples  below. 
<procedure>             ::= procedure <identifier> <input par>
                            <optional out par exp> <stmt list> endproc
<stmt list>             ::= {<stmt>}*
<stmt>                  ::= <ptr-assign> | <exchange> | <insert> | <call> |
                            <while> | <scan> | <if>
<while>                 ::= while <bool exp> do <stmt list> endwhile
<scan>                  ::= scan <updn with> <ptr var> from <ptr exp>
                            to <ptr exp> <stmt list> endscan
<if>                    ::= if <bool exp> then <stmt list> else
                            <stmt list> endif
<ptr-assign>            ::= <ptr var> ← <ptr exp>
<exchange>              ::= exchange x <ptr exp> with x <ptr exp>
<insert>                ::= insert x <ptr exp> below x <ptr exp>
<call>                  ::= call <proc identifier> (<ptr exp>, <ptr exp>)
                            <optional out par var>
<optional out par var>  ::= <empty> | *(<ptr var> {, <ptr var>}*)
<input par>             ::= (<ptr var>, <ptr var>)
<optional out par exp>  ::= <empty> | *(<ptr exp> {, <ptr exp>}*)
<bool exp>              ::= <pl-disjunct> {or <pl-disjunct>}*
<pl-disjunct>           ::= <pl-predicate> {and <pl-predicate>}*
<pl-predicate>          ::= <ptr pred> | <key pred>
<ptr pred>              ::= <ptr exp> <nerel> <ptr exp>
<key pred>              ::= x <ptr exp> <nerel> x <ptr exp>
<nerel>                 ::= <rel> | ≠
<rel>                   ::= < | ≤ | = | ≥ | >
<updn with>             ::= up with | down with
<ptr exp>               ::= 0 | 1 | 2 | <ptr var> | <ptr var> ± 1
<ptr var>               ::= i | j | k | l | m | n


Figure 5.3. Syntax of the Programming Language
<assertion>        ::= <disjunct> {or <disjunct>}*
<disjunct>         ::= <predicate> {and <predicate>}*
<predicate>        ::= <ptr predicate> | <array predicate>
<ptr predicate>    ::= <ptr exp> {<rel> <ptr exp>}
<array predicate>  ::= sorted <segment def> | <segment>
                       {<rel> <segment>}
<segment>          ::= array <segment def> | sorted
                       <segment def> | x <ptr exp>
<segment def>      ::= (<lower boundary>, <upper boundary>)
<lower boundary>   ::= <boundary>
<upper boundary>   ::= <boundary>
<boundary>         ::= <ptr exp>

Figure 5.4. Syntax of the Assertion Language
The statement

    scan up with i from j + 1 to k - 1
        <body>
    endscan

is equivalent to

    i ← j + 1
    while i ≤ k - 1 do
        <body>
        i ← i + 1
    endwhile

The loop variable i of the scan statement is not considered unmodifiable
by the body; that is, the body may assign to it.


The statement "insert x_i below x_j" is equivalent to the following
abstract program:

    t ← x_i; p ← i
    if i < j then
        while p ≤ j - 2 do x_p ← x_{p+1}; p ← p + 1 endwhile
        {circular up shift}
    else
        while p ≥ j + 1 do x_p ← x_{p-1}; p ← p - 1 endwhile
        {circular down shift}
    endif
    x_p ← t
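
The effect of the insert statement can be tried out with a short Python
sketch. It assumes a 1-based array (`x[0]` unused) and assumes that the
saved element t is stored back at position p once the shift is done,
which is what a circular shift requires; the function name
`insert_below` is hypothetical.

```python
def insert_below(x, i, j):
    """Move x[i] to the position just below x[j], shifting the rest."""
    t, p = x[i], i
    if i < j:
        while p <= j - 2:       # circular up shift
            x[p] = x[p + 1]
            p += 1
    else:
        while p >= j + 1:       # circular down shift
            x[p] = x[p - 1]
            p -= 1
    x[p] = t                    # the saved element lands next to x_j


x = [None, 10, 20, 30, 40, 50]
insert_below(x, 1, 4)           # 10 ends up immediately before 40
print(x[1:])

y = [None, 10, 20, 30, 40, 50]
insert_below(y, 5, 2)           # 50 ends up immediately before 20
print(y[1:])
```

In both directions the array remains a permutation of its former
contents, which is the key-preserving property the verifier relies on.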
A program, in SORTLAB, is a collection of procedures and it
always includes the main procedure "sort." All procedures are external
and may be recursive. The array x is global to all procedures; indices
are always local. Thus, the only way a procedure may receive an index
value is by receiving it as a (value) parameter.

Notice that apart from the array to be sorted x, and ptr
variables, no temporary variables are provided. Two padding elements x_0 and
x_{n+1} are predefined to be -∞ and +∞ respectively; these may be used
as sentinels. Thus, the entry and exit assertions of the main procedure
sort (n) are:

    n ≥ 1 and x_0 < array (1,n) < x_{n+1}

    sorted (1,n)

5.3.2  Language  Recognizers 

The tokens of the programming and assertion languages are so
chosen that (except for if, insert, and i ← ...) they can be recognized
by their first character. As soon as the first character of the token
is  typed,  the  statement  is  completed  as  far  as  possible  and  is  displayed. 
An  illegal  key-press  causes  it  to  be  flashed  and  is  ignored.  Thus,  in 
writing  the  following  statements  only  the  underlined  keys  need  be  pressed: 

scan  down  with  1_  from  H  "to  2 

endscan 

exchange  xi^l  with  xj+1 

5.3.3  Program  Editor 

Each procedure constitutes a "display page," and these may be
selected by typing in the name of the procedure. A statement is inserted
by first giving a line number to it and then writing the statement. An
assertion is given as the exit assertion of a statement; the assertion is
displayed at the end of the statement. Thus, the line labeled 16* in
Figure 5.2 is the exit assertion of the if-statement at line 11. It is
also the loop invariant of the while-loop at line 4. Any sequence of
statements can be deleted and, if so desired, saved. A segment from among
several of such saved program segments may later be inserted into a
procedure.

Compound statements like the while-statement are written in two
steps: first, the while-envelope with its corresponding endwhile and
without a body is written. At a later time, the body is formed either
as a sequence of new statements, or by inserting a saved program segment.
Thus, a number of simple but common errors, like unmatched end-brackets
of statements or unintentional nesting of bodies because of a missing
begin, end, or semicolon, do not arise. Further, structural changes of
a procedure do not require reparsing. Every structural change results
in a new page displaying the updated version of the procedure, with
automatic indentation.

A number of ideas incorporated into this editor are originally
due to [Hansen 1971].


5.3.4     Interpreter 


The interpreter can execute any program written in the programming
language. The assertions are also executed, and their truth
value at run-time is indicated. It is possible to execute the program
in various modes, including step-by-step mode. During execution, the
contents of the array being sorted are dynamically displayed along with
the location of various indices (Figure 5.2). Only the currently active
procedure is displayed; as each new procedure is entered, that procedure
is displayed. An invocation trace is also displayed.

The interpreter carefully checks for all possible violations
of the assumptions made by the verification system. Each procedure is
assumed to permute only the elements of the array segment between the two
input parameters of the procedure (1 is an "implicit" input parameter
of procedure sort; this prevents it from becoming a recursive procedure
since each call statement must have two input (actual) parameters!). The
values of all index variables should be between 0 and n + 1 where n is
the size of the array; once an index variable has a value outside this
range, it is not possible for that variable to have a legal value.

5.3.5  Sorting  Program  Verifier 

The  student  requests  that  his  program  be  verified  when  he 
has  completed  writing  it.  The  verifier  then  proceeds  to  verify  his 
program  provided  all  the  required  assertions  (an  invariant  for  each 
loop;  an  entry,  and  an  exit  assertion  for  each  procedure;  an  entry 
assertion  for  each  call  statement)  are  given.  The  process  of  verification 
is  not  interactive.  The  student  is  informed  only  of  the  outcome  of  the 
verification.  If  his  program  is  not  proven  correct,  the  lemmas  which 
were  false  are  indicated.  He  may  then  request  a  counterexample,  or 
proceed  directly  to  edit  his  program. 

We emphasize that when a program is not proven correct, it
may be because strong enough assertions were not given.

5.3.6  Possible  Extensions  of  SORTLAB 


It seems possible to construct a "sorting expert" consisting
of such components as loop invariant generator, termination prover,
efficiency analyzer, elegance judger, and algorithms expert. Systems
similar in intent to these subcomponents have been designed in other
contexts. Elspas [1973] describes how the efficiency of a program can be
analyzed automatically, a by-product being termination. Considerable
literature (see, e.g., [Wegbreit 1974]) has appeared on the automatic
generation of loop invariants. Ruth [1974] discusses a system which
attempts to give quality feedback to the student using built-in knowledge
about specific sorting algorithms like bubble sort. An elegance
judger may be readily constructed if that elusive characteristic,
"elegance," of a program is quantified in terms of measurable quantities
like the length of the proofs of correctness, number of statements,
variables, etc.

The tutoring system SORTLAB would certainly be more attractive
with such a sorting expert. The construction of this component seems
doable, but is another project of the same magnitude as the verifier.

6.  DISCUSSION

Many verifiers have been constructed. Yet, none of them can
be considered a tool usable by ordinary programmers. The number and
variety of programs proven is small. Data structures more complex than
linear arrays or lists are handled unnaturally. More significant is
the poor performance of these verifiers in terms of the memory space
and computation time needed.

This failure in making significant advances toward constructing
verifiers that are mechanical aids to program writing can be largely
attributed to the very attitude taken in building several of the present
day verifiers. They all seem to start with the presumption: Given an
arbitrary program with assertions, prove it. Evidence is building up
that practically usable verifiers cannot be constructed unless the
problem domain is limited, programs are well-composed, abstract data
structures and operations are used, and properties of programs and data
structures are studied from a semantic viewpoint. Thus, we foresee not
one ultimate program verifier but a class of limited domain program
verifiers, each capable of proving/disproving a certain class of programs.

Section  6.1  elaborates  these  points.  Section  6.2  describes  a 
few  of  the  significant  verifiers  and  theorem  provers  built  so  far. 

6.1  A  Critique  of  Program  Verifiers 

McCarthy  [1963]  was  one  of  the  earliest  to  recognize  the  need 
to  replace  debugging  of  systems  (computer  programs,  engineering  systems, 
etc.)  by  proofs  that  systems  meet  their  specifications.  Considering 
programs as mathematical objects, he goes on to show how statements
about programs may be proven. The theory developed by Floyd [1967] for
iterative programs is comprehensive and equates the correctness of the
program to the truth of a certain set of lemmas generated from it.

King  [1969]  constructed  a  verifier  which  mechanized  both  lemma 
generation  and  proof.  This  clearly  demonstrated  the  feasibility  of  an 
automatic  program  verifier  and  became  the  pilot  system  for  a  dozen  or 
so  systems  to  follow  (see  [London  1972]).  Many  of  these  verifiers  are 
the  result  of  unfortunate  marriages  between  a  lemma  generator  and  a 
classic  automatic  theorem  prover,  and  none  can  be  considered  to  be  sig- 
nificantly superior  to  King's  verifier. 

6.1.1  Theorem  Provers  for  Program  Verifiers 


Work on classic theorem proving always concerned itself with
the general problem of syntactically deducing that a given statement of
first-order logic follows from a set of axioms (see, e.g., [Chang and
Lee 1974], and [Bledsoe 1975]). Pointing out some of the theoretical
impediments to automatic theorem proving, Rabin [1974] comments that
this work had such high hopes and aims as:

. . . to develop a theorem prover which will enable
them to solve mathematical problems, and hopefully
even difficult mathematical problems, by the computer.
If one wants to slide into the realm of
science fiction then one may talk about proving or
disproving Fermat's conjecture by an automated
theorem proving program. . . .

Since first-order logic is undecidable, one is looking only for efficient
semi-decision procedures which will produce proofs of statements which
are theorems and halt, and which may not halt on nontheorems. But, as
Rabin makes plain, even in such theoretically decidable domains as
Presburger Arithmetic (first-order sentences involving natural numbers
and the operation of addition only), to computationally determine if a
given sentence is true or false may be practically undecidable.

If  verification  is  ever  to  replace  debugging,  verifiers  should 
be  able  to  handle  incorrect  programs.  That  is,  we  need  theorem  provers 
which  are  decision  procedures  for  the  lemmas  generated.  Thus,  the  pro- 
grams that  a  verifier  attempts  to  prove  or  disprove  should  be  so  limited 
that  the  lemmas  generated  belong  to  a  decidable  domain.  This  can  be 
done  only  by  carefully  designing  a  language  for  assertions  expressive 
enough  to  allow  all  "legitimate"  assertions  one  might  want  to  make  in 
proving  properties  of  programs  from  an  interesting  class  of  programs. 
The  theorem  prover  should  then  be  a  decision  procedure  for  all  sentences 
in  the  assertion  language. 

Since even decision procedures may take impractically long to
decide if a sentence is true or false, they should be so engineered that
for a large subset of the lemmas that can be considered to be "naturally
occurring" in well-designed programs such decisions are made rapidly.
Thus, we may not mind if it takes super-exponential time to decide if a
verification condition of the following kind

$\{\pi \mid i \leftarrow i \mid \omega\}$

is correct (because the programmer has the bad manners of misusing the
verifier to prove an irrelevant mathematical theorem that $\pi$ implies
$\omega$) so long as the verifier gives correctness proofs of legitimate
programs quickly.

Furthermore, the lemmas generated in proving well-designed,
legitimate programs are not typical of manual mathematics. These lemmas
are shallow and follow fairly directly from (properly chosen) axioms and
inference rules. Clearly, it is impractical to include all lemmas to
be proven as the set of inference rules; a small number of inference
rules should be carefully tailored so that short proofs of naturally
occurring lemmas can be given rapidly. Two examples of theorem provers
so designed are [King and Floyd 1972] and the theorem prover described
in Chapter 3 of this thesis.


6.1.2  Effect  of  Program  Composition 


The  structure  and  statements  of  a  program  clearly  will  have  an 
effect  on  its  verification.  Writing  abstract  programs  using  abstract 
data  structures  has  been  advocated  by  such  authors  as  Dijkstra  and  Hoare. 
The  solution  to  a  programming  problem  is  constructed  using  operations  on 
data  structures  that  are  natural  to  the  problem.  These  operations  and 
data  structures  will  then  be  written  at  a  lower  level  of  abstraction,  and 
so  on,  until  all  operations  and  abstract  data  structures  are  implemented 
in  the  host  programming  language.  The  advantages  of  such  an  approach 
lie  in  the  factorization  of  detail  at  any  given  level  of  abstraction. 



Such  abstraction  is  helpful  not  only  to  the  human  designer  of 
the  program,  but  also  to  the  program  verifier.  When  data  structures  are 
manipulated  solely  through  designated  procedures,  properties  related  to 
data  integrity  can  be  proven  by  considering  these  procedures  independent- 
ly of  their  invocations  using  generator  induction  [Hoare  1972].  Thus, 
for  example,  that  a  sorting  algorithm  has  only  permuted  the  given  ordered 
set  of  elements  can  be  shown  by  proving  that  the  primitive  operations 
exchange  and  insert  were  element-conserving. 
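
The argument can be sketched in Python under one stated assumption: `insert` is taken here to remove the element at one position and reinsert it at another, shifting the elements in between. The thesis text in this section does not spell out the semantics of insert, so this reading is hypothetical. If each primitive conserves the multiset of elements, then so does any sequence of calls, by induction on the length of the sequence.

```python
from collections import Counter


def exchange(x, i, j):
    """Swap x[i] and x[j]."""
    x[i], x[j] = x[j], x[i]


def insert(x, i, j):
    """Assumed semantics: move x[i] to position j, shifting the rest by one."""
    x.insert(j, x.pop(i))


def conserves_elements(op, x, i, j):
    """Apply op to x in place and report whether the multiset of keys survived."""
    before = Counter(x)
    op(x, i, j)
    return Counter(x) == before


def run(x, program):
    """Apply a sequence of (op, i, j) steps, checking conservation at each step."""
    return all(conserves_elements(op, x, i, j) for op, i, j in program)
```

The check is made once per primitive, independently of where the primitive is invoked, which is the point of the paragraph above.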

Another  important  advantage  to  be  gained  is  that  undecidable 
domains  of  lemmas  may  be  isolated  in  a  program.  Arithmetic  operations 
such  as  multiplication,  division  and  addition  which  result  in  theore- 
tically or  practically  undecidable  domains  can  be  grouped  together  and 
their  input/output  relationships  explicitly  given.  These  relationships 
may  then  be  proven  separately  by  ad  hoc  techniques.  Often,  such  arith- 
metic is  not  essential  to  the  property  of  the  program  being  proven.  For 
example,  the  division  by  2  in  binary  search,  and  multiplication  by  2  in 
siftup  of  heap  sort  are  not  essential  to  the  correctness  proofs.  The 
only  thing  that  matters  for  the  correctness  of  the  search  is  that  the 
interval  of  uncertainty  be  partitioned  into  two  smaller  subintervals. 
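
A small Python sketch makes the point about binary search concrete. The search below takes the splitting rule as a parameter (`split` is a hypothetical name introduced for this illustration); any rule that returns a point inside the current interval yields a correct search, and only the running time changes.

```python
def search(x, key, split):
    """Return an index k with x[k] == key in the sorted list x, or None."""
    lo, hi = 0, len(x) - 1
    while lo <= hi:
        m = split(lo, hi)
        # The only property the correctness argument needs from split:
        assert lo <= m <= hi
        if x[m] == key:
            return m
        if x[m] < key:
            lo = m + 1   # key, if present, lies in x[m+1..hi]
        else:
            hi = m - 1   # key, if present, lies in x[lo..m-1]
    return None


def midpoint(lo, hi):
    """The usual division by 2."""
    return (lo + hi) // 2


def leftmost(lo, hi):
    """A degenerate split: still correct, but the search becomes linear."""
    return lo
```

With `midpoint` this is the usual binary search; with `leftmost` it degrades to a linear scan, yet the same invariant (the key, if present, lies in x[lo..hi]) proves both.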

These operations on data structures are generally implemented
as procedures. Only selected components of a data structure are
modified by the procedures, keeping the remaining environment of the
procedure intact. However, the rules of inference about procedure calls
such as those given in [Hoare 1973] or in [Elspas et al. 1973] deal only
with "entire variables" (a whole array, a whole stack, etc.) and are
weaker than they should be. That is, correct programs exist which cannot
be proven using such inference rules. A "predicate transformer" (a la
[Dijkstra 1976]) offers a solution to this problem.

The rule of procedure invocation of [Hoare 1973] can be roughly
described as follows:

Let Q be a procedure whose correctness with respect to
$\phi$ and $\psi$ has been established independently, i.e.,

$\{\phi \mid Q \mid \psi\}$

Then to prove $\{\alpha \mid \mathrm{call}\ Q \mid \beta\}$ verify the following:

$\alpha \models \phi'$  and  $\psi' \models \beta$

where $\phi'$ and $\psi'$ are obtained from $\phi$ and $\psi$ with
appropriate substitutions made for the formal parameters of Q.

Clearly, this rule is sufficient to prove
$\{\alpha \mid \mathrm{call}\ Q \mid \beta\}$. But the exit
assertion $\psi$ of Q cannot, in general, contain enough information to
imply $\beta$ when Q is called under different input environments, all of
them satisfying $\phi'$. A number of properties guaranteed by $\alpha$
may be unchanged by Q, and hence true upon exiting Q. What is needed is a
meta-operator which produces a $\beta'$ as the transformations made by Q
on $\alpha$ when $\alpha$ implies $\phi'$. Such an operator in the context
of backward substitution is a "predicate transformer," transforming the
given exit assertion $\beta$ of the call Q into $\alpha'$, which is the
weakest entry condition to call Q such that $\beta$ is true if and when
call Q returns.

The verifier should be given a predicate transformer for each
procedure Q which may be invoked under varying circumstances. However,
if the procedure Q is not well-written (e.g., global variables were used
where local variables should have been used), the predicate transformer
will be an overspecification of Q. It should also be realized that
some procedures are called only in certain contexts. In such cases,
Hoare's rule is simpler to use.
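
The difference between the two rules can be seen on a toy example. The Python sketch below is not the symbolic backward substitution used by the verifier: it treats assertions as Boolean functions on states, models a hypothetical procedure `call_q` as a deterministic, terminating function on states, and checks implications by brute force over a small finite set of states. All names are illustrative.

```python
from itertools import product


def call_q(x):
    """Hypothetical procedure Q: sorts the segment x[0..1], leaves x[2] alone."""
    lo, hi = sorted(x[:2])
    return (lo, hi, x[2])


def wp(proc, post):
    """Weakest precondition of a deterministic, terminating procedure."""
    return lambda x: post(proc(x))


def implies(p, q, states):
    """p |= q, checked exhaustively over a finite set of states."""
    return all(q(s) for s in states if p(s))


STATES = list(product(range(3), repeat=3))

phi = lambda x: True                          # entry assertion of Q
psi = lambda x: x[0] <= x[1]                  # exit assertion of Q
alpha = lambda x: x[2] == 2                   # known before the call
beta = lambda x: x[0] <= x[1] and x[2] == 2   # needed after the call

# Invocation rule: alpha |= phi and psi |= beta.
hoare_rule_applies = implies(alpha, phi, STATES) and implies(psi, beta, STATES)

# Predicate transformer: alpha |= wp(call Q, beta).
transformer_applies = implies(alpha, wp(call_q, beta), STATES)
```

Here `hoare_rule_applies` is false, because the exit assertion of Q says nothing about x[2], while `transformer_applies` is true, because the transformer carries the untouched fact x[2] = 2 across the call.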

6.1.3     Proving  Certain  Properties  of  Programs 

It is not difficult to invent innocent-looking programs whose
correctness is very difficult to establish. Pure and deep mathematical
results may be used in the program and hence there may not be a "directly
perceivable" relation between what is being computed and the stated
intentions of the program.

For example, a depth-first search algorithm [Tarjan 1972]
computes certain simple functions NUMBER(·) and LOWPT(·) on vertices, and
deletes all edges from a stack until a certain condition on NUMBER(·) is
satisfied. This property is quite easy to prove. That this set of
edges constitutes a biconnected component of the graph, however, is a
difficult theorem. It is interesting to note that this and several
other graph algorithms use very simple arithmetic (successor function +1,
and < relation). Habermann [1975] gives another example of an
algorithm (a quadratic-hash algorithm) whose correctness proof does not
readily follow from the program structure itself.

"Existential" properties are also quite difficult to prove
using the inductive assertion approach. Consider, for example, an
algorithm enumerating all circuits of a graph. Its exit assertion is:

Every  subgraph  g  (of  the  given  graph  G)  that  is  output 
is  a  circuit  of  G,  and  conversely,  every  circuit  of 
G  is  output. 

As  another  example,  consider  a  shortest  path  algorithm.     The 
exit  assertion  is: 

The  graph  G  has  no  path  shorter  than  the  one  found 
by  the  algorithm. 

The path p found by the algorithm often appears explicitly in the
algorithm, while the set of all paths of G that p is being compared to
does not.

6.2     Previous  Work  Related  to  This  Thesis 

In  a  survey,  London  [1972]  reports  that  there  are  more  than  a 
dozen  verifiers  constructed  so  far,  most  of  these  using  the  inductive 
assertion  method.     None  of  these  verifiers  can,  in  general,  handle  incor- 
rect programs.     Only  algorithms  that  were  known  to  be  correct  a  priori 
have  been  mechanically  verified  with  varying  degrees  of  human  interven- 
tion in  their  proofs. 

We briefly describe two of these verifiers, King's and SRI's,
which have influenced the verifier presented in this thesis. Other
significant verifiers include [Luckham et al. 1973], [Deutsch 1973],
[Boyer and Moore 1975], [Good et al. 1975] and [Marmier 1975]. Cooper
[1975] discusses independently some ideas similar to those expressed in
Chapter 3.

6.2.1      King's  Verifier 

King [1969] constructed a verifier which mechanized both
lemma generation and proof. A commendable engineering approach
was taken in tailoring the theorem prover. The programs, and hence the
lemmas, were limited to integer-valued variables, including linear
arrays. Several ad hoc techniques which depend on detailed knowledge
of integer expressions are used in proving a large class of lemmas
about integers. The premise and the negation of the conclusion of the
lemma to be proven are represented in a "normal" form, and the resulting
set of linear inequalities and nonlinear equations is algebraically
solved [King and Floyd 1972].

Among the programs that King's verifier has proven, without
any human intervention, are: simple insertion sort, bubble sort, and
computing $x^y$ using the binary representation of y.

Subsequent verifiers ([Elspas et al. 1973], [Luckham et al.
1973], [Good et al. 1975], [Deutsch 1973]) have provided for
interaction with the user in an attempt to prove a much larger class of
programs, resulting in the proofs of such programs as Hoare's FIND.

6.2.2  SRI Verifier


The theorem prover [Elspas et al. 1973] is a collection of
inference rules together with a set of strategies. Given the premise
of a verification condition to be proven, determining whether it implies
the conclusion proceeds in a goal-driven manner. The theorem prover has
several high-level inference rules about arrays. Unfortunately, the
theorem prover is embedded in a disastrously general QA4 system [Rulifson
1972], and lacks a sense of direction. At any given point, several
inference rules are applicable, and the system applies each one in turn
until it succeeds in proving the goal or, when the lemma is false,
exhausts all inference rules. However, it should be noted that
the application of an inference rule may generate further instances of
application for another rule, and vice versa, resulting in thrashing.
The user may be called upon to provide advice on such and other occasions,
which can then alter the course of deduction.

Both  King's  verifier,  and  the  SRI  verifier  handle  arrays  unsat- 
isfactorily, using  the  equivalent  of  access  and  change  functions 
of  McCarthy  [1967]  because  array  elements  are  considered  to  be  of  the 
same  type  as  their  indices,  and  interassignments  between  them  are  allowed. 
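
For readers unfamiliar with this style of array reasoning, the following Python sketch shows access and change in their usual functional reading: `change` returns a new array and never updates in place. This is an assumed, simplified rendering for illustration, not the formulation used by either verifier.

```python
def access(a, i):
    """The value stored at position i of array a."""
    return a[i]


def change(a, i, v):
    """A new array equal to a except that position i holds v."""
    return a[:i] + (v,) + a[i + 1:]


def exchange_via_change(a, i, j):
    """An exchange of a[i] and a[j], written as two nested change terms."""
    return change(change(a, i, access(a, j)), j, access(a, i))


def axiom_holds(a, i, j, v):
    """access(change(a, i, v), j) is v when i == j, and access(a, j) otherwise."""
    expected = v if i == j else access(a, j)
    return access(change(a, i, v), j) == expected
```

Even a single exchange produces a doubly nested term, and every later lemma about the array must reason through such terms with a case split on whether two indices are equal.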

Our  own  inference  rules  about  arrays  (see  Chapter  3)  may  be 
considered  as  refinements  of  the  rules  in  the  SRI  verifier. 

6.3     Salient  Features  of  the  Sorting  Program  Verifier 

The verifier presented in this thesis has been designed to
meet specific performance requirements. It was to be usable in an
interactive computing system which imposed severe constraints on both
the amount of memory and computation time that can be used (see Section
5.1). This section briefly analyzes the factors that contributed to
the fast decision procedure, and notes some of its shortcomings.

6.3.1  Decidable 


The verifier presented here is unique in that it is the only
verifier with a decision procedure for the verification conditions of
the programs it accepts. It makes no pretense of being general.
The syntax of the input programs has been carefully designed to reject
all programs that the verifier cannot prove or disprove. It provides
two basic operations, exchange and insert, to permute the elements of the
array, thereby guaranteeing that the elements of the array are conserved.
The assertion language is just powerful enough to express all the
assertions that may be made about sorting-type algorithms. The basic
predicates provided capture the notion of sequential access in sorting
algorithms.

The  decidability  is  due  to  such  restriction  of  the  lemmas 
generated,  and  the  partitionability  of  the  sequentially  accessed  array 
structure.  This  results  in  a  canonical  representation  for  each  lemma  to 
be  proven.  The  rule  of  local  implication  lets  us  decide  if  a  given  pre- 
dicate is  implied  by  the  hypothesis  without  any  search.  At  no  time  does 
our  theorem  prover  need  to  backtrack  or  consider  various  inference  rules 
for  their  applicability. 


6.3.2  Fast 


The theorem prover is not only a decision procedure, but gives
these decisions rapidly for most theorems encountered in proving sorting
algorithms. It should be noted that loop invariants of most algorithms
(not necessarily sorting) are conjunctions of predicates. This theorem
prover is specially suited to prove such theorems by natural deduction.
It might appear that a large number of linear orderings of boundaries
will be considered in the proof of a lemma; however, if the algorithm
is well-written this is generally not the case. Such lack of information
about how the boundaries are ordered is not typical of sorting algorithms.

Two  factors  contributing  to  the  speed  of  the  theorem  prover 
are  the  large  inferences  made  about  array  segments,  without  considering 
their  individual  elements,  and  the  rule  of  local  implication. 

6.3.3  Backward  Function  Evaluation 

The  backward  function  evaluation,  in  the  context  provided  by 
the  ptr  expressions  which  constrain  the  boundaries  of  array  segments, 
considerably  simplifies  a  given  lemma.  This  completely  eliminates  the 
need  for  such  pseudo-functions  as  access,  and  change  of  McCarthy,  used  in 
nearly  all  other  verifiers.  It  is  important  to  realize  that  such  con- 
textual evaluation  is  valid  only  if  assignments  among  array  indices  and 
elements  are  not  permitted. 

6.3.4  Counterexample  Generation 


We consider the generation of counterexamples one of the most
important duties of a program verifier. If debugging is ever to be
replaced by verification, incorrect programs must be handled by verifiers
by either suggesting corrective actions, indicating the unproven
verification condition, or actually generating a counterexample for the
skeptic.

As  shown  in  Chapter  3,  a  modified  shortest-path  algorithm  is 
the  counterexample  generator  used  by  this  verifier. 

6.3.5  Some  Shortcomings 

It is interesting to note that the theorem prover is not goal
oriented. Thus, in proving even a trivial theorem such as

$\mathrm{sorted}(1,n) \models \mathrm{sorted}(1,n)$

it considers two partitions (one for each of the cases $n \le 0$ and
$n > 0$) of the array. This is typical of decision procedures in that
they may ignore shortcuts. However, the strength of our decision procedure
is in its orientation toward naturally occurring theorems.

More  seriously,  it  is  hard  to  generalize  the  theorem  prover. 
For  example,  if  we  permit  the  predicate  that  all  keys  of  an  array  seg- 
ment are  distinct,  the  theorem  prover  cannot  be  extended  in  a  straight- 
forward manner. 


6.4  Conclusion 


SORTLAB shows that verifiers for programs from a limited
domain of application, which incorporate some of the semantics of the
domain, are practical. It would be interesting to see an approach similar
to that described in this thesis tried for another domain that is well-
understood and easily formalized mathematically.

We believe that such limited program verifiers will be the
trend of the future, in the wake of recent results in practical
undecidability and the lack of progress in mechanical program verification
in general.

APPENDIX
Performance of the Verifier - An Example

The following selection sort program has a weak assertion:

 1    procedure sort (n)
 *        TRUE
 2        scan up with i from 1 to n-1
 3            scan up with j from i+1 to n
 4                if xi > xj then
 5                    exchange xi with xj
 6                else
 7                endif
 7*               S(1,I) < XI < A(I+1,J) & 1 ≤ I < J <
 8            endscan
 *            S(1,I) < A(I+1,N) & 1 < I < N
 9        endscan
 *        S(1,N)
10    endproc

The theorem prover disproves the corresponding verification condition,

(subst j-1 for j in 7*) and j < n
stmts [4...7] ⊨ (7*)

in 1114 CPU-milliseconds. When the assertion at 7* is given as:

S(1,I-1) < A(I,N) & XI < A(I+1,J) & 1 < I < J ≤ N

the program is proven correct in 9346 CPU-milliseconds.
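
The contrast between the two assertions can be explored with a Python sketch. It rests on several assumptions that are not stated in this appendix: S(a,b) is read as "x[a..b] is sorted", a comparison between segments is read as "every key on the left is at most every key on the right", and the index clause is read as 1 ≤ i ≤ j ≤ n so that the loop-entry case j = i+1 is covered. The function names are illustrative. The sketch checks single instances of the verification condition by execution; it does not reproduce the verifier's decision procedure.

```python
def is_sorted(x, lo, hi):
    return all(x[k] <= x[k + 1] for k in range(lo, hi))


def seg_le(x, a_lo, a_hi, b_lo, b_hi):
    """Every key in x[a_lo..a_hi] is <= every key in x[b_lo..b_hi]."""
    return all(x[p] <= x[q]
               for p in range(a_lo, a_hi + 1)
               for q in range(b_lo, b_hi + 1))


def weak(x, i, j, n):
    """Assumed reading of S(1,I) <= XI <= A(I+1,J) with 1 <= I <= J <= N."""
    return (is_sorted(x, 1, i) and seg_le(x, 1, i, i, i)
            and seg_le(x, i, i, i + 1, j) and 1 <= i <= j <= n)


def strong(x, i, j, n):
    """Assumed reading of S(1,I-1) <= A(I,N) & XI <= A(I+1,J)."""
    return (is_sorted(x, 1, i - 1) and seg_le(x, 1, i - 1, i, n)
            and seg_le(x, i, i, i + 1, j) and 1 <= i <= j <= n)


def body(x, i, j):
    """Statements 4..7: the if-statement of the inner scan."""
    if x[i] > x[j]:
        x[i], x[j] = x[j], x[i]


def sort(x, n, assertion):
    """Run the program on x[1..n] (x[0] is unused), checking 7* at run time."""
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            body(x, i, j)
            assert assertion(x, i, j, n)
    assert is_sorted(x, 1, n)


def preserved(assertion, x, i, j, n):
    """One instance of: (subst j-1 for j in 7*) and j <= n, stmts 4..7, 7*."""
    if not (assertion(x, i, j - 1, n) and j <= n):
        return True   # hypothesis false, nothing to show for this state
    y = list(x)
    body(y, i, j)
    return assertion(y, i, j, n)
```

Under these readings both assertions hold whenever the program actually runs, but only the strong one is preserved from every state that satisfies it. For the state x[1..3] = 5, 6, 1 with i = 2 and j = 3, the weak assertion holds before statements 4 to 7 and fails after them. A state like this never arises in an execution; it exists only because the weak assertion does not relate x[1..i-1] to the rest of the array.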